[57] ABSTRACT

Several improved turbo code apparatuses and methods. The 
invention encompasses several classes: (1) A data source is 
applied to two or more encoders with an interleaver between 
the source and each of the second and subsequent encoders. 
Each encoder outputs a code element which may be trans- 
mitted or stored. A parallel decoder provides the ability to 
decode the code elements to derive the original source 
information d without use of a received data signal corre- 
sponding to d. The output may be coupled to a multilevel 
trellis-coded modulator (TCM). (2) A data source d is 
applied to two or more encoders with an interleaver between 
the source and each of the second and subsequent encoders. 
Each of the encoders outputs a code element. In addition, the 
original data source d is output from the encoder. All of the 
output elements are coupled to a TCM. (3) At least two data 
sources are applied to two or more encoders with an inter- 
leaver between each source and each of the second and 
subsequent encoders. The output may be coupled to a TCM. 
(4) At least two data sources are applied to two or more 
encoders with at least two interleavers between each source 
and each of the second and subsequent encoders. (5) At least 
one data source is applied to one or more serially linked 
encoders through at least one interleaver. The output may be 
coupled to a TCM. The invention includes a novel way of 
terminating a turbo coder.

HYBRID CONCATENATED CODES AND
ITERATIVE DECODING 

ORIGIN OF INVENTION 

The invention described herein was made in the perfor- 
mance of work under a NASA contract, and is subject to the 
provisions of Public Law 96-517 (35 USC 202) in which the 
Contractor has elected to retain title. 

BACKGROUND OF THE INVENTION 

1. Field of the Invention 

This invention relates to error correcting codes. 

2. Description of Related Art 

Turbo codes are binary error-correcting codes built from
the parallel concatenation of two recursive systematic
convolutional codes and using a feedback decoder. Recently
introduced by Berrou, et al. (“Near Shannon limit
error-correcting coding and decoding: Turbo-codes”, ICC ’93,
Conf. Rec., pp. 1064-1070, Geneva, May 1993), the basics of
such codes are described further in U.S. Pat. Nos. 5,446,747
and 5,406,570.

The reference and patents to Berrou describe a basic turbo 
code encoder architecture of the type shown in the block 
diagram in FIG. 1. As described in Berrou ’747, FIG. 1 
shows a block diagram of a coder in an example where two 
distinct codes are used in parallel. Each source data element 
d to be coded is coupled to a first systematic coding module 
11 and, through a temporal interleaving module 12, to a 
second systematic coding module 13. The coding modules 
11 and 13 may be of any known systematic type, such as 
convolutional coders, that take into account at least one of 
the preceding source data elements in order to code the 
source data element d. The codes implemented in coding 
modules 11 and 13 may be identical or different. 

The input information bits d feed the first coding module 
11 and, after being scrambled by the interleaving module 12, 
enter the second coding module 13. A codeword of a parallel 
concatenated code consists of the information input bits to 
the first encoder followed by the parity check bits of both 
encoders. 

Under this architecture, there are at least two coded data
elements Y1 and Y2, coming from distinct coders 11 and 13,
associated with each source data element d. A data element
X, equal to the source data element d, is also transmitted.
This characteristic was described in Berrou ’747 as
“necessary for the making of the decoding modules”.

The transmitted coded data elements and source data 
element become received data elements at a decoder. The 
task of the decoder is to re-construct the original data source 
d bit stream from the received data elements, which may 
have been corrupted by noise. 

Thus, an important aspect of prior art turbo code encoders 
is that they transmit a data element X equal to input source 
data element d. 

The present invention results from observation that the 
prior art fails to achieve a simpler architecture for the 
encoder, and fails to provide as robust encoding as is 
required or desired in certain environments, including low- 
power, constrained-bandwidth uses, such as deep space 
communications and personal communication devices, and 
high-noise environments. 

SUMMARY OF THE INVENTION 

The present invention encompasses several improved 
turbo code apparatuses and methods. In a first class of turbo
code encoders, a data source d is applied to two or more
encoders with an interleaver between the data source and 
each of the second and subsequent encoders. Each of the 
encoders outputs a turbo code element which may be 
transmitted or stored. A parallel decoder provides the ability
to decode the turbo code elements to derive the original 
source information d without use of a received data signal 
corresponding to d. The output of the turbo code encoder 
optionally may be coupled to a multilevel trellis-coded 
modulator that provides excellent performance.

In a second class of turbo code encoders, a data source d 
is applied to two or more encoders with an interleaver 
between the data source and each of the second and subse- 
quent encoders. Each of the encoders outputs a turbo code 
element. In addition, the original data source d is output
from the encoder. All of the output elements are coupled to 
a multilevel trellis-coded modulator. 

In a third class of turbo code encoders, at least two data 
sources are applied to two or more encoders with an
interleaver between each data source and each of the second and
subsequent encoders. Each of the encoders outputs a plu- 
rality of turbo code elements which may be transmitted or 
stored. The output of the turbo code encoder optionally may 
be coupled to a multilevel trellis-coded modulator. 

In a fourth class of turbo code encoders, at least two data 
sources are applied to two or more encoders with at least two 
interleavers between each data source and each of the 
second and subsequent encoders. Each of the encoders 
outputs a plurality of turbo code elements which may be
transmitted or stored. The output of the turbo code encoder 
optionally may be coupled to a multilevel trellis-coded 
modulator. 

In a fifth class of turbo code encoders, at least one data 
source is applied to one or more serially linked encoders
through at least one interleaver. 

The invention also encompasses a novel method of ter- 
minating or resetting a turbo coder, and a general parallel 
decoder structure. 

The details of the preferred embodiments of the present
invention are set forth in the accompanying drawings and 
the description below. Once the details of the invention are 
known, numerous additional innovations and changes will 
become obvious to one skilled in the art. 

BRIEF DESCRIPTION OF THE DRAWINGS 

FIG. 1 is a block diagram of a prior art turbo code encoder. 

FIG. 2 is a block diagram of a general model of a turbo
code encoder having three codes.

FIG. 3 is a matrix of a weight-4 sequence. 

FIG. 4 is a block diagram of an input trellis termination 
method for turbo code encoders in accordance with the 
present invention. 

FIG. 5 is a block diagram of a turbo encoder showing
output only of encoded parity elements.

FIG. 6A is a block diagram of a third embodiment of the
present invention, showing output of multiple encoded
elements derived from multiple input data sources, the use of
multiple interleavers on at least one data source, and an
optional multilevel trellis-coded modulator.

FIG. 6B is a block diagram of a variation of the coder
shown in FIG. 6A, showing a self-concatenating code.

FIG. 6B2 is a block diagram showing a variation of a
self-concatenated code, where the encoder has at least one
input data line d, and d is sent to the modulator.

FIG. 6C is a block diagram of a fourth embodiment of the
present invention, showing output of multiple encoded
elements derived from multiple input data sources, the use of
multiple interleavers on at least one data source, and an
optional multilevel trellis-coded modulator.

FIG. 7A is a block diagram of a serial encoder in 
accordance with the present invention. 

FIG. 7B is a block diagram of a parallel-serial encoder in 
accordance with the present invention. 

FIG. 7C is a block diagram of a serial-parallel hybrid 
encoder in accordance with the present invention. 

FIG. 8 is a diagram showing the performance of various 
turbo codes. 

FIG. 9 is a block diagram of a first rate 1/2 turbo coder in
accordance with the present invention.

FIG. 10 is a block diagram of a second rate 1/2 turbo coder
in accordance with the present invention.

FIG. 11 is a block diagram of a rate 1/3 turbo coder in
accordance with the present invention.

FIG. 12 is a block diagram of a rate 1/4 turbo coder in
accordance with the present invention.

FIG. 13 is a diagram showing the performance of various
rate 1/4 turbo codes.

FIG. 14 is a diagram showing the performance of various 
turbo codes with short block sizes. 

FIG. 15 is a diagram showing the performance of various 
three code turbo codes. 

FIG. 16 is a block diagram of a prior art decoder structure.

FIG. 17 is a block diagram of a parallel decoder structure 
in accordance with the present invention. 

FIG. 18 is a block diagram of a channel model. 

FIG. 19 is a signal flow graph for extrinsic information in
a decoder.

FIG. 20A is a block diagram of a single parallel block 
decoder in accordance with the present invention. 

FIG. 20B is a block diagram showing a multiple turbo
code decoder for a three code system, using three blocks
similar to the decoder in FIG. 20A.

FIG. 20C is a block diagram showing a multiple turbo
code decoder for a three code system, using three blocks
similar to the decoder in FIG. 20A, and having a switchable
serial decoder mode.

FIG. 20D is a block diagram showing a decoder
corresponding to the self-concatenating coder of FIG. 6B.

FIG. 20D2 is a block diagram showing a decoder
corresponding to the self-concatenated coder of FIG. 6B2.

FIG. 20E is a block diagram showing a decoder corre- 
sponding to the serial coder of FIG. 7A. 

FIG. 20E2 shows a block diagram of an original and a 
modified MAP algorithm. 

FIG. 20F is a block diagram showing a decoder
corresponding to the serial coder of FIG. 7B.

FIG. 20G is a block diagram showing a decoder
corresponding to the hybrid concatenated code (serial-parallel,
type II) of FIG. 7C.

FIG. 21 is a block diagram of a 16 QAM turbo trellis- 
coded modulation coder in accordance with the present 
invention. 

FIG. 22 is a diagram showing the BER performance of the
coder shown in FIG. 21.

FIG. 23 is a block diagram of an 8 PSK turbo trellis-coded
modulation coder in accordance with the present invention.

FIG. 24 is a diagram showing the BER performance of the
coder shown in FIG. 23.

FIG. 25 is a block diagram of a 64 QAM turbo trellis- 
coded modulation coder in accordance with the present 
invention. 

FIG. 26 is a diagram showing the BER performance of the 
coder shown in FIG. 25. 

FIG. 27 is a block diagram of a general embodiment of the
present invention, showing output of the source data element
and encoded elements to a multilevel trellis-coded
modulator.

FIG. 28 is a block diagram showing a general decoder for 
the TCM encoded output of, for example, FIGS. 21, 23, and 
25. 

Like reference numbers and designations in the various 
drawings indicate like elements. 

DETAILED DESCRIPTION OF THE 
INVENTION 

Throughout this description, the preferred embodiment 
and examples shown should be considered as exemplars, 
rather than as limitations on the present invention. 
Overview 

Turbo codes are believed to be able to achieve near 
Shannon-limit error correction performance with relatively 
simple component codes and large interleavers. The present 
invention encompasses several novel designs for turbo code 
encoders and a corresponding decoder that is suitable for 
error correction in high noise or constrained-bandwidth, low 
power uses, such as personal communications systems 
(PCS) applications, where lower rate codes can be used. 

For example, in multiple-access schemes like CDMA
(Code Division Multiple Access), the capacity (maximum
number of users per cell) can be expressed as:

$$ C = \frac{\eta}{E_b/N_0} + 1 \tag{1} $$

where $\eta$ is the processing gain and $E_b/N_0$ is the required
signal-to-noise ratio to achieve a desired bit error rate (BER)
performance ($E_b$: energy received per useful bit; $N_0$:
monolateral spectral density of noise). For a specified BER, a
smaller required $E_b/N_0$ implies a larger capacity or cell size.
Unfortunately, to reduce $E_b/N_0$, it is necessary to use very
complex codes (e.g., large constraint length convolutional
codes). However, the present invention includes turbo codes
that are suitable for CDMA and PCS applications and which
can achieve superior performance with limited complexity.
For example, if a (7, 1/2) convolutional code is used at
BER = $10^{-3}$, the capacity is $C = 0.5\eta$. However, if two (5, 1/2)
punctured convolutional codes or three (4, 1/2) punctured
codes are used in a turbo encoder structure in accordance
with the present invention, the capacity can be increased to
$C = 0.8\eta$ (with 192-bit and 256-bit interleavers, which
correspond to 9.6 Kbps and 13 Kbps with roughly 20 ms
frames). Higher capacity can be obtained with larger
interleavers. Note that low rate codes can be used for CDMA
since an integer number of chips per coded symbol are used
and bandwidth is defined mainly by chip rate.
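
As a rough numerical illustration of Eq. (1), the following Python sketch evaluates the capacity for a given processing gain and required $E_b/N_0$. The function name `cdma_capacity`, the processing gain of 128, and the 3 dB and 1 dB operating points are assumptions chosen only for illustration; they are not taken from the specification, although they give capacities close to the $0.5\eta$ and $0.8\eta$ figures quoted above.

```python
def cdma_capacity(processing_gain: float, ebn0_db: float) -> float:
    """Users per cell from Eq. (1): C = eta / (Eb/N0) + 1, with Eb/N0 given in dB."""
    ebn0_linear = 10.0 ** (ebn0_db / 10.0)  # convert dB to a linear ratio
    return processing_gain / ebn0_linear + 1.0


if __name__ == "__main__":
    eta = 128.0  # hypothetical processing gain, for illustration only
    for ebn0_db in (3.0, 1.0):  # assumed operating points, not from the source
        c = cdma_capacity(eta, ebn0_db)
        print(f"Eb/N0 = {ebn0_db} dB -> C = {c:.1f} users ({(c - 1) / eta:.2f} * eta + 1)")
```

Lowering the required $E_b/N_0$ by 2 dB in this sketch raises the capacity by a factor of about 1.6, which is the effect the passage attributes to the stronger code.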
Implementation 

The invention may be implemented in hardware or 
software, or a combination of both. In the preferred 
embodiment, the functions of a turbo coder and decoder 
designed in conformance with the principles set forth herein
are implemented as one or more integrated circuits using a 
suitable processing technology (e.g., CMOS).

As another example, the invention may be implemented
in computer programs executing on programmable comput- 
ers each comprising a processor, a data storage system 
(including volatile and non-volatile memory and/or storage 
elements), at least one input device, and at least one output 
device. Program code is applied to input data to perform the 
functions described herein and generate output information. 
The output information is applied to one or more output 
devices, in known fashion. 

Each such program may be implemented in a high level 
procedural or object oriented programming language to 
communicate with a computer system. However, the pro- 
grams can be implemented in assembly or machine 
language, if desired. In any case, the language may be a 
compiled or interpreted language. 

Each such computer program is preferably stored on a
storage medium or device (e.g., ROM or magnetic disk)
readable by a general or special purpose programmable
computer, for configuring and operating the computer when
the storage medium or device is read by the computer to
perform the procedures described herein. The inventive
system may also be considered to be implemented as a 
computer-readable storage medium, configured with a com- 
puter program, where the storage medium so configured 
causes a computer to operate in a specific and predefined 
manner to perform the functions described herein. An 
example of one such type of computer is a personal com- 
puter. 

Turbo Code Encoders 

Following is a discussion of several general consider- 
ations in designing turbo code encoders and decoders in 
accordance with the present invention. Since these consid- 
erations pertain to the novel designs described below as well 
as prior art designs in some cases, a simple 3-code encoder,
as shown in FIG. 2, will be used as an initial example.
General Structure of an Encoder 

In FIG. 2, the turbo code encoder contains three recursive
binary convolutional encoders, with $M_1$, $M_2$ and $M_3$
memory cells (comprised of the delay gates D shown in each
encoder) respectively. In general, the three component
encoders may not be identical and may not have identical
code rates. The information bit sequence $u = (u_1, \ldots, u_N)$ of
length N is applied to the component Encoder 1 through
interleaver $\pi_1$ (normally set to the identity function), which
outputs a sequence $u_1$. The component Encoder 1 produces
two output sequences, $x_{1i}$ and $x_{1p}$ (the subscript i stands for
“information” bits, while the subscript p stands for “parity”
bits). The component Encoder 2 operates on a reordered
sequence of information bits, $u_2$, produced by an interleaver
(also known as a permuter), $\pi_2$, of length N, and outputs the
sequence $x_{2p}$. The component Encoder 3 operates on a
reordered sequence of information bits, $u_3$, produced by an
interleaver, $\pi_3$, of length N, and outputs the sequence $x_{3p}$.
Similarly, subsequent component encoders operate on a
reordered sequence of information bits, $u_j$, produced by
interleaver $\pi_j$, and output the sequence $x_{jp}$.

In the preferred embodiment, each interleaver is a pseudo- 
random block scrambler defined by a permutation of N 
elements with no repetitions. That is, a complete block is 
read into an interleaver and read out in a specified (fixed) 
random order. 
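
A minimal Python sketch of such a pseudo-random block interleaver is given below. It assumes the permutation is drawn once from a seeded generator and then held fixed; the names `make_interleaver`, `interleave`, and `deinterleave`, and the use of a seed, are illustrative assumptions rather than part of the described embodiment.

```python
import random


def make_interleaver(n: int, seed: int) -> list[int]:
    """Return a fixed pseudo-random permutation of range(n) (no repeated indices)."""
    rng = random.Random(seed)  # seeded so encoder and decoder share the same order
    perm = list(range(n))
    rng.shuffle(perm)
    return perm


def interleave(block: list[int], perm: list[int]) -> list[int]:
    """Read a complete block in and read it out in the permuted order."""
    return [block[p] for p in perm]


def deinterleave(block: list[int], perm: list[int]) -> list[int]:
    """Invert interleave(): put each element back at its original position."""
    out = [0] * len(perm)
    for k, p in enumerate(perm):
        out[p] = block[k]
    return out


if __name__ == "__main__":
    pi2 = make_interleaver(16, seed=2)
    u = [1, 0, 0, 1, 0, 0, 0, 0, 1, 1, 0, 1, 0, 0, 1, 0]
    u2 = interleave(u, pi2)
    print(u2, deinterleave(u2, pi2) == u)
```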

In general, a decoder (discussed more fully below)
receives the transmitted sequences $x_{1i}$ and $x_{jp}$ as received
sequences $y_j$. As noted above, the task of the decoder is to
re-construct the original data source d bit stream from the
received data elements, which may have been corrupted by
noise. In the present invention, the encoder does not need to
transmit the original data sequence. If one or more encoder
outputs, including possibly $x_{1i}$, is punctured (not
transmitted) based on a predetermined pattern, the punctured
positions will be filled with erasures at the receiver.
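
The puncturing and erasure-filling step can be sketched in Python as follows. This is only an illustration under stated assumptions: the pattern is applied periodically, a pattern entry of 1 means "transmit" and 0 means "puncture", and an erasure is represented by `None`. The function names `puncture` and `depuncture` are hypothetical.

```python
def puncture(bits, pattern):
    """Keep only the positions where the periodic pattern holds a 1."""
    period = len(pattern)
    return [b for k, b in enumerate(bits) if pattern[k % period] == 1]


def depuncture(received, pattern, length, erasure=None):
    """Rebuild a length-`length` sequence, inserting erasures at punctured positions."""
    period = len(pattern)
    symbols = iter(received)
    return [next(symbols) if pattern[k % period] == 1 else erasure
            for k in range(length)]


if __name__ == "__main__":
    parity = [1, 0, 1, 1, 0, 1]
    sent = puncture(parity, [1, 0])            # every second parity bit is dropped
    print(sent)
    print(depuncture(sent, [1, 0], len(parity)))
```

A soft-input decoder would typically treat an erased position as carrying no channel information; that choice is outside the scope of this sketch.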

FIG. 2 shows an example where a rate r = 1/n = 1/4 code is
generated by three component codes with $M_1 = M_2 = M_3 = M = 2$,
producing the outputs:

$$ x_{1i} = u $$
$$ x_{1p} = u \cdot g_b / g_a $$
$$ x_{2p} = u_2 \cdot g_b / g_a $$
$$ x_{3p} = u_3 \cdot g_b / g_a $$

($\pi_1$ is assumed to be an identity, i.e., no permutation), where
the generator polynomials $g_a$ and $g_b$ have octal representation
$(7)_{octal}$ and $(5)_{octal}$, respectively. Note that various code
rates can be obtained by proper puncturing of $x_{1p}$, $x_{2p}$, $x_{3p}$,
and even $x_{1i}$ if a decoder in accordance with the present
invention is used (see below).
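
The rate 1/4 example can be sketched in Python as shown below. This is a minimal illustration, not the circuit of FIG. 2: it assumes each component encoder is a recursive encoder with feedback polynomial $g_a = 1 + D + D^2$ (octal 7) and feedforward polynomial $g_b = 1 + D^2$ (octal 5), starts from the all-zero state, and applies no trellis termination. The names `rsc_parity` and `turbo_encode`, and the representation of interleavers as index lists, are assumptions.

```python
def rsc_parity(u, feedback=(1, 1, 1), feedforward=(1, 0, 1)):
    """Parity stream of a recursive convolutional encoder, u * g_b / g_a over GF(2).

    Tap tuples list the coefficients of D^0, D^1, ..., D^M.
    """
    m = len(feedback) - 1
    state = [0] * m                      # state[i] holds a_{k-1-i}
    parity = []
    for bit in u:
        a = bit
        for i in range(m):               # feedback: a_k = u_k + sum g_a[i] * a_{k-i}
            a ^= feedback[i + 1] & state[i]
        p = feedforward[0] & a
        for i in range(m):               # feedforward taps form the parity bit
            p ^= feedforward[i + 1] & state[i]
        parity.append(p)
        state = [a] + state[:-1]         # shift the register
    return parity


def turbo_encode(u, pi2, pi3):
    """Return (x_1i, x_1p, x_2p, x_3p) for the three-code example; pi_1 is the identity."""
    x1i = list(u)
    x1p = rsc_parity(u)
    x2p = rsc_parity([u[p] for p in pi2])
    x3p = rsc_parity([u[p] for p in pi3])
    return x1i, x1p, x2p, x3p


if __name__ == "__main__":
    u = [0, 0, 1, 0, 0, 1, 0, 0, 0, 0]
    identity = list(range(len(u)))
    for name, seq in zip(("x1i", "x1p", "x2p", "x3p"), turbo_encode(u, identity, identity)):
        print(name, seq)
```

With identity interleavers the weight-2 input above yields four parity ones per encoder, so the four output streams together have weight 14.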

Design of Preferred Constituent Encoders 

A design for constituent convolutional codes, which are
not necessarily optimum convolutional codes, was originally
reported in S. Benedetto and G. Montorsi, “Design of
Parallel Concatenated Convolutional Codes” (to be
published in IEEE Transactions on Communications, 1996) for
rate 1/n codes. We extend those results to rate b/n codes. It
has been suggested (without proof) that good random codes
are obtained if $g_a$ is a primitive polynomial. This suggestion,
used in the report cited above to obtain “good” rate 1/2
constituent codes, will be used here to obtain
“good” rate 1/3, 2/3, 3/4, and 4/5 constituent codes. By “good”
codes, we mean codes with a maximum effective free
distance $d_{ef}$, that is, those codes that maximize the minimum
output weight for weight-2 input sequences (because this
weight tends to dominate the performance characteristics
over the region of interest).

Maximizing the weight of output codewords
corresponding to weight-2 data sequences gives the best BER
performance for a moderate bit signal-to-noise ratio (SNR) as the
random interleaver size N gets large. In this region, the
dominant term in the expression for bit error probability of
a turbo code with q constituent encoders is:

$$ P_b \approx \frac{\beta}{N^{q-1}} \, Q\!\left( \sqrt{ 2 r \frac{E_b}{N_0} \left( \sum_{j=1}^{q} d^p_{j,2} + 2 \right) } \right) \tag{2} $$

where $d^p_{j,2}$ is the minimum parity-weight (weight due to
parity checks only) of the codewords at the output of the jth
constituent code due to weight-2 data sequences, and $\beta$ is a
constant independent of N. Define $d_{j,2} = d^p_{j,2} + 2$ as the
minimum output weight including parity and information bits, if
the jth constituent code transmits the information
(systematic) bits. Usually one constituent code transmits the
information bits (j=1), and the information bits of other
codes are punctured. Define $d_{ef} = \sum_{j=1}^{q} d^p_{j,2} + 2$ as the effective
free distance of the turbo code and $1/N^{q-1}$ as the
“interleaver’s gain.” We have the following bound on $d^p_{j,2}$ for any
constituent code: for any r = b/(b+1) recursive systematic
convolutional encoder with generator matrix:

$$ G = \left[ \begin{array}{c|c} I_{b \times b} & \begin{array}{c} h_1(D)/h_0(D) \\ h_2(D)/h_0(D) \\ \vdots \\ h_b(D)/h_0(D) \end{array} \end{array} \right] \tag{3} $$

where $I_{b \times b}$ is the identity matrix, $\deg[h_i(D)] \le m$, $h_i(D) \ne h_0(D)$,
i = 1, 2, . . . , b, and $h_0(D)$ is a primitive polynomial of degree
m, the following upper bound holds:

$$ d^p_{j,2} \le \left\lfloor \frac{2^{m-1}}{b} \right\rfloor + 2 \tag{4} $$

A corollary of this is that, for any r = b/n recursive
systematic convolutional code with b inputs, b systematic
outputs, and n-b parity output bits using a primitive feedback
generator, we have:

$$ d^p_{j,2} \le \left\lfloor \frac{(n-b)\,2^{m-1}}{b} \right\rfloor + 2(n-b) \tag{5} $$

There is an advantage to using b>1, since the bound in the
above equation for rate b/bn codes is larger than the bound
for rate 1/n codes. Examples of codes that meet the upper
bound for b/bn codes are set forth below.

A. Best Rate b/b+1 Constituent Codes

We obtained the best rate 2/3 codes as shown in Table 1,
where $d_2 = d^p_{j,2} + 2$. The minimum-weight codewords
corresponding to weight-3 data sequences are denoted by $d_3$, $d_{min}$
is the minimum distance of the code, and k = m+1 in all the
tables. By “best” we mean only codes with a large $d_2$ for a
given m that result in a maximum effective free distance. We
obtained the best rate 3/4 codes as shown in Table 2 and the
best rate 4/5 codes as shown in Table 3.

TABLE 1

Best rate 2/3 constituent codes.

 k   Code generator                         d_2   d_3   d_min
 3   h_0 = 7    h_1 = 3    h_2 = 5            4     3     3
 4   h_0 = 13   h_1 = 15   h_2 = 17           5     4     4
 5   h_0 = 23   h_1 = 35   h_2 = 27           8     5     5
     h_0 = 23   h_1 = 35   h_2 = 33           8     5     5
 6   h_0 = 45   h_1 = 43   h_2 = 61          12     6     6

TABLE 2

Best rate 3/4 constituent codes.

 k   Code generator                                    d_2   d_3   d_min
 3   h_0 = 7    h_1 = 5    h_2 = 3    h_3 = 1            3     3
     h_0 = 7    h_1 = 5    h_2 = 3    h_3 = 4            3     3
     h_0 = 7    h_1 = 5    h_2 = 3    h_3 = 2            3     3
 4   h_0 = 13   h_1 = 15   h_2 = 17   h_3 = 11           4     4
 5   h_0 = 23   h_1 = 35   h_2 = 33   h_3 = 25           5     4
     h_0 = 23   h_1 = 35   h_2 = 27   h_3 = 31           5     4
     h_0 = 23   h_1 = 35   h_2 = 37   h_3 = 21           5     4
     h_0 = 23   h_1 = 27   h_2 = 37   h_3 = 21           5     4

TABLE 3

Best rate 4/5 constituent codes.

 k   Code generator                                               d_2   d_3   d_min
 4   h_0 = 13   h_1 = 15   h_2 = 17   h_3 = 11   h_4 = 7            4     3     3
     h_0 = 13   h_1 = 15   h_2 = 17   h_3 = 11   h_4 = 5            4     3     3
 5   h_0 = 23   h_1 = 35   h_2 = 33   h_3 = 37   h_4 = 31           5     4     4
     h_0 = 23   h_1 = 35   h_2 = 21   h_3 = 37   h_4 = 31           5     4     4
     h_0 = 23   h_1 = 35   h_2 = 21   h_3 = 37   h_4 = 31           5     4     4

B. Best Punctured Rate 1/2 Constituent Codes

A rate 2/3 constituent code can be derived by puncturing
the parity bit of a rate 1/2 recursive systematic convolutional
code using, for example, a pattern P=[10]. A puncturing
pattern P has zeros where parity bits are removed.

Consider a rate 1/2 recursive systematic convolutional code
$(1, g_1(D)/g_0(D))$. For an input u(D), the parity output can be
obtained as:

$$ x(D) = \frac{u(D)\, g_1(D)}{g_0(D)} \tag{6} $$

We would like to puncture the output x(D) using, for
example, the puncturing pattern P[10] (decimation by 2) and
obtain the generator polynomials $h_0(D)$, $h_1(D)$, and $h_2(D)$ for
the equivalent rate 2/3 code:

$$ G = \begin{bmatrix} 1 & 0 & h_1(D)/h_0(D) \\ 0 & 1 & h_2(D)/h_0(D) \end{bmatrix} \tag{7} $$

We note that any polynomial $f(D) = \sum_i a_i D^i$, $a_i \in GF(2)$, can
be written as $f(D) = f_1(D^2) + D f_2(D^2)$, where $f_1(D^2)$
corresponds to the even power terms of f(D), and $D f_2(D^2)$
corresponds to the odd power terms of f(D). Now, if we use
this approach and apply it to u(D), $g_1(D)$, and $g_0(D)$, then we
can rewrite the equation for x(D) as:

$$ x_1(D^2) + D x_2(D^2) = \frac{\left(u_1(D^2) + D u_2(D^2)\right)\left(g_{11}(D^2) + D g_{12}(D^2)\right)}{g_{01}(D^2) + D g_{02}(D^2)} \tag{8} $$


where $x_1(D)$ and $x_2(D)$ correspond to the punctured output
x(D) using puncturing patterns P[10] and P[01], respectively.
If we multiply both sides of the above equation by
$(g_{01}(D^2) + D g_{02}(D^2))$ and equate the even and the odd power
terms, we obtain two equations in two unknowns, namely
$x_1(D)$ and $x_2(D)$. For example, solving for $x_1(D)$, we obtain:

$$ x_1(D) = u_1(D) \frac{h_1(D)}{h_0(D)} + u_2(D) \frac{h_2(D)}{h_0(D)} \tag{9} $$

where $h_0(D) = g_0(D)$ and:

$$ h_1(D) = g_{11}(D)\, g_{01}(D) + D\, g_{12}(D)\, g_{02}(D) \tag{10} $$

$$ h_2(D) = D\, g_{12}(D)\, g_{01}(D) + D\, g_{11}(D)\, g_{02}(D) \tag{11} $$

From the second equation above, it is clear that $h_{2,0} = 0$. A
similar method can be used to show that for P[01] we get
$h_{2,m} = 0$. These imply that the conditions that $h_{i,0} = 1$ and $h_{i,m} = 1$
will be violated. Thus, we have the following theorem: if the
parity puncturing pattern is P=[10] or P=[01], then it is
impossible to achieve the upper bound on $d_2 = d^p_{j,2} + 2$ for rate
2/3 codes derived by puncturing rate 1/2 codes.
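
The even/odd decomposition behind this theorem can be checked numerically. The Python sketch below is an illustration under stated assumptions: GF(2) polynomials are stored as integers with bit i holding the coefficient of $D^i$, the generators $g_0 = 13$ and $g_1 = 15$ (octal) are used as an example, and $h_2(D)$ is taken as $D[g_{12}(D) g_{01}(D) + g_{11}(D) g_{02}(D)]$, the form obtained by equating even powers. The helper names (`pmul`, `split`, `series_div`, `punctured_generators`, `check`) are hypothetical.

```python
import random


def pmul(a, b):
    """Carry-less product of two GF(2) polynomials (bit i = coefficient of D^i)."""
    result, shift = 0, 0
    while b >> shift:
        if (b >> shift) & 1:
            result ^= a << shift
        shift += 1
    return result


def split(f):
    """Return (f1, f2) such that f(D) = f1(D^2) + D * f2(D^2)."""
    f1 = f2 = 0
    i = 0
    while f >> i:
        if (f >> i) & 1:
            if i % 2 == 0:
                f1 |= 1 << (i // 2)
            else:
                f2 |= 1 << (i // 2)
        i += 1
    return f1, f2


def series_div(num, den, n):
    """First n power-series coefficients of num/den over GF(2); den(0) must be 1."""
    out, rem = [], num
    for k in range(n):
        bit = (rem >> k) & 1
        out.append(bit)
        if bit:
            rem ^= den << k              # cancel the D^k term
    return out


def punctured_generators(g0, g1):
    """h0, h1, h2 of the equivalent rate 2/3 code for pattern P=[10]."""
    g01, g02 = split(g0)
    g11, g12 = split(g1)
    h1 = pmul(g11, g01) ^ (pmul(g12, g02) << 1)
    h2 = (pmul(g12, g01) ^ pmul(g11, g02)) << 1
    return g0, h1, h2


def check(g0, g1, n_bits=32, trials=20, seed=0):
    """Compare the even-indexed parity bits of u*g1/g0 with (u1*h1 + u2*h2)/h0."""
    rng = random.Random(seed)
    h0, h1, h2 = punctured_generators(g0, g1)
    for _ in range(trials):
        u = rng.getrandbits(n_bits)
        kept = series_div(pmul(u, g1), g0, n_bits)[0::2]
        u1, u2 = split(u)
        x1 = series_div(pmul(u1, h1) ^ pmul(u2, h2), h0, n_bits // 2)
        if kept != x1:
            return False
    return True


if __name__ == "__main__":
    h0, h1, h2 = punctured_generators(0o13, 0o15)
    print(oct(h0), oct(h1), oct(h2), "h2 constant term:", h2 & 1)
    print("decomposition matches simulation:", check(0o13, 0o15))
```

Because $h_2(D)$ carries an overall factor of D, its constant term is zero for any choice of $g_0$ and $g_1$, which is the obstruction the theorem describes.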

The best rate 1/2 constituent codes with puncturing pattern
P=[10] that achieve the largest $d_2$ are given in Table 4.

TABLE 4

Best rate 1/2 punctured constituent codes.

 k   Code generator            d_2   d_3   d_min
 3   g_0 = 7    g_1 = 5          4     3     3
 4   g_0 = 13   g_1 = 15         5     4     4
 5   g_0 = 23   g_1 = 37         7     4     4
     g_0 = 23   g_1 = 31         7     4     4
     g_0 = 23   g_1 = 33         6     5     5
     g_0 = 23   g_1 = 35         6     4     4
     g_0 = 23   g_1 = 27         6     4     4


C. Best Rate 1/n Constituent Codes

As known in the art, for rate 1/n codes, the upper bound
for b=1 reduces to:

$$ d^p_{j,2} \le (n-1)\left(2^{m-1} + 2\right) \tag{12} $$

Based on this condition, we have obtained the best rate 1/3
and 1/4 codes without parity repetition, as shown in Tables 5
and 6, where $d_2 = d^p_{j,2} + 2$ represents the minimum output
weight given by weight-2 data sequences. The best
non-punctured rate 1/2 constituent codes have been reported by S.
Benedetto et al., supra.
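
As a quick arithmetic check, the bound of Eq. (12) can be evaluated and compared with the $d_2$ values listed in Tables 5 and 6, using k = m+1 and $d_2 = d^p_{j,2} + 2$. The Python sketch below is illustrative only; the function name `parity_weight_bound` and the way the table entries are transcribed into lists are assumptions.

```python
def parity_weight_bound(n: int, m: int) -> int:
    """Eq. (12): upper bound on the weight-2 parity weight of a rate 1/n code."""
    return (n - 1) * (2 ** (m - 1) + 2)


# (k, d_2) pairs transcribed from Tables 5 (n = 3) and 6 (n = 4).
TABLE_5 = [(2, 4), (3, 8), (4, 14), (5, 22)]
TABLE_6 = [(4, 20), (5, 32)]

if __name__ == "__main__":
    for n, table in ((3, TABLE_5), (4, TABLE_6)):
        for k, d2 in table:
            limit = parity_weight_bound(n, k - 1) + 2   # bound expressed on d_2
            status = "meets bound" if d2 == limit else "below bound"
            print(f"rate 1/{n}, k={k}: d_2={d2}, bound={limit} ({status})")
```

Under this reading, the listed codes with k = 4 and k = 5 reach the bound, while the shortest codes fall below it.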


TABLE 5

Best rate 1/3 constituent codes.

 k   Code generator                         d_2   d_3   d_min
 2   g_0 = 3    g_1 = 2    g_2 = 1            4    ∞      4
 3   g_0 = 7    g_1 = 5    g_2 = 3            8     7     7
 4   g_0 = 13   g_1 = 17   g_2 = 15          14    10    10
 5   g_0 = 23   g_1 = 33   g_2 = 37          22    12    10
     g_0 = 23   g_1 = 25   g_2 = 37          22    11    11


TABLE 6

Best rate 1/4 constituent codes.

 k   Code generator                                    d_2   d_3   d_min
 4   g_0 = 13   g_1 = 17   g_2 = 15   g_3 = 11          20    12    12
 5   g_0 = 23   g_1 = 35   g_2 = 27   g_3 = 37          32    16    14
     g_0 = 23   g_1 = 33   g_2 = 27   g_3 = 37          32    16    14
     g_0 = 23   g_1 = 35   g_2 = 33   g_3 = 37          32    16    14
     g_0 = 23   g_1 = 33   g_2 = 37   g_3 = 25          32    15    15


General Interleaver Design Considerations 

In order to estimate the performance of a code, it is
necessary to have information about its minimum distance,
weight distribution, or actual code geometry, depending on
the accuracy required for the bounds or approximations. The
challenge is in finding the pairing of codewords from each
individual encoder, induced by a particular set of
interleavers. We have found that it is best to avoid joining low-weight
codewords from one encoder with low-weight words from
the other encoders. In the example of FIG. 2, the component
codes have minimum distances 5, 2, and 2. This will produce
a worst-case minimum distance of 9 for the overall code.
Note that this would be unavoidable if the encoders were not
recursive since, in this case, the minimum weight word for
all three encoders is generated by the input sequence
u=(00 . . . 0000100 . . . 000) with a single “1”, which will
appear again in the other encoders, for any choice of
interleavers. This motivates the use of recursive encoders,
where the key ingredient is the recursiveness and not the fact
that the encoders are systematic. For this example, the input
sequence u=(00 . . . 00100100 . . . 000) generates a
low-weight codeword with weight 6 for the first encoder. If
the interleavers do not “break” this input pattern, the
resulting codeword’s weight will be 14. In general, weight-2
sequences with 2+3t zeros separating the 1’s would result in
a total weight of 14+6t if there were no permutations. By
contrast, if the number of zeros between the ones is not of
this form, the encoded output is nonterminating until the end
of the block, and its encoded weight is very large unless the
sequence occurs near the end of the block.

With permutations before the second and third encoders,
a weight-2 sequence with its 1’s separated by $2 + 3t_1$ zeros
will be permuted into two other weight-2 sequences with 1’s
separated by $2 + 3t_i$ zeros, where i = 2, 3, and where each $t_i$ is
defined as a multiple of 1/3. If any $t_i$ is not an integer, the
corresponding encoded output will have a high weight
because then the convolutional code output is nonterminating
(until the end of the block). If all $t_i$’s are integers, the
total encoded weight will be $14 + 2\sum_{i=1}^{3} t_i$. Thus, one of the
considerations in designing the interleaver is to avoid integer
triplets $(t_1, t_2, t_3)$ that are simultaneously small in all three
components. In fact, it would be desirable to design an interleaver
to guarantee that the smallest value of $\sum_{i=1}^{3} t_i$ (for integer $t_i$)
grows with the block size N.

For comparison, consider the same encoder structure in
FIG. 2, except with the roles of $g_a$ and $g_b$ reversed. Now the
minimum distances of the three component codes are 5, 3,
and 3, producing an overall minimum distance of 11 for the
total code without any permutations. This is apparently a
better code, but it turns out to be inferior as a turbo code.
This paradox is explained by again considering the critical
weight-2 data sequences. For this code, weight-2 sequences
with $1 + 2t_1$ zeros separating the two 1’s produce
self-terminating output and, hence, low-weight encoded words.
In the turbo encoder, such sequences will be permuted to
have separations $1 + 2t_i$, where i = 2, 3, for the second and third
encoders, but each $t_i$ is now defined as a multiple of 1/2. Now
the total encoded weight for integer triplets $(t_1, t_2, t_3)$ is
$11 + \sum_{i=1}^{3} t_i$. Notice that this weight grows only half as fast
with $\sum_{i=1}^{3} t_i$ as the previously calculated weight for the
original code. If $\sum_{i=1}^{3} t_i$ can be made to grow with block size
by the proper choice of an interleaver, then clearly it is
important to choose component codes that cause the overall
weight to grow as fast as possible with the individual
separations $t_i$. This consideration outweighs the criterion of
selecting component codes that would produce the highest
minimum distance if unpermuted.
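
The two weight formulas can be checked with a short simulation. The Python sketch below is a minimal illustration, assuming a recursive encoder started in the all-zero state, a block long enough that the weight-2 patterns terminate inside it, one systematic stream of weight 2, and three parity streams. The names `rsc_parity`, `weight2_parity`, and `total_weight` are hypothetical.

```python
def rsc_parity(u, feedback, feedforward):
    """Parity stream of a recursive encoder; taps are coefficients of D^0..D^M."""
    m = len(feedback) - 1
    state = [0] * m
    parity = []
    for bit in u:
        a = bit
        for i in range(m):
            a ^= feedback[i + 1] & state[i]
        p = feedforward[0] & a
        for i in range(m):
            p ^= feedforward[i + 1] & state[i]
        parity.append(p)
        state = [a] + state[:-1]
    return parity


def weight2_parity(zeros_between, feedback, feedforward, block=64):
    """Parity weight produced by two 1's separated by `zeros_between` zeros."""
    u = [0] * block
    u[0] = 1
    u[zeros_between + 1] = 1
    return sum(rsc_parity(u, feedback, feedforward))


def total_weight(separations, feedback, feedforward):
    """Systematic weight 2 plus the parity weight seen by each of the three encoders."""
    return 2 + sum(weight2_parity(s, feedback, feedforward) for s in separations)


G7, G5 = (1, 1, 1), (1, 0, 1)   # octal 7 and octal 5

if __name__ == "__main__":
    for t in [(0, 0, 0), (1, 0, 2), (2, 2, 2)]:
        original = total_weight([2 + 3 * ti for ti in t], feedback=G7, feedforward=G5)
        reversed_ = total_weight([1 + 2 * ti for ti in t], feedback=G5, feedforward=G7)
        print(t, "original:", original, "reversed:", reversed_)
```

In this sketch the original code gains two units of weight per unit of $t_i$ while the reversed code gains one, which is the comparison made in the text.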

There are also many weight-n, n = 3, 4, 5, . . . , data
sequences that produce self-terminating output and, hence,
low encoded weight. However, as argued below, these 
sequences are much more likely to be broken up by random 
interleavers than the weight-2 sequences and are, therefore, 
likely to produce nonterminating output from at least one of 
the encoders. Thus, turbo code structures that would have 
low minimum distances if unpermuted can still perform well 
if the low-weight codewords of the component codes are 
produced by input sequences with weight higher than two. 

We briefly examine the issue of whether one or more
random interleavers can avoid matching small separations
between the 1's of a weight-2 data sequence with equally
small separations between the 1's of its permuted version(s).
Consider for example a particular weight-2 data sequence
(. . . 001001000 . . . ) which corresponds to a low weight
codeword in each of the encoders of FIG. 2. If we randomly
select an interleaver of size N, the probability that this
sequence will be permuted into another sequence of the
same form is roughly 2/N (assuming that N is large, and
ignoring minor edge effects). The probability that such an
unfortunate pairing happens for at least one possible position
of the original sequence within the block of size N is
approximately $1 - (1 - 2/N)^N \approx 1 - e^{-2}$. This implies that the
minimum distance of a two-code turbo code constructed
with a random permutation is not likely to be much higher
than the encoded weight of such an unpermuted weight-2
data sequence, e.g., 14 for the code in FIG. 2. (For the worst
case permutations, the $d_{min}$ of the code is still 9, but these
permutations are highly unlikely if chosen randomly.) By
contrast, if we use three codes and two different interleavers,
the probability that the particular sequence above will be
reproduced by both interleavers is only $(2/N)^2$. Now the
probability of finding such an unfortunate data sequence
somewhere within the block of size N is roughly
$1 - [1 - (2/N)^2]^N \approx 4/N$. Thus it is probable that a three-code turbo code
using two random interleavers will see an increase in its
minimum distance beyond the encoded weight of an unpermuted
weight-2 data sequence. This argument can be
extended to account for other weight-2 data sequences
which may also produce low weight codewords, e.g.,
$(\ldots 00100(000)^t 1000 \ldots)$, for the code in FIG. 2.

For comparison, consider a weight-3 data sequence such 
as (. . . 0011100 . . . ), which for our example corresponds 
to the minimum distance of the code (using no 
permutations). The probability that this sequence is reproduced
with one random interleaver is roughly $6/N^2$, and the
probability that some sequence of that form is paired with
another of the same form is $1 - (1 - 6/N^2)^N \approx 6/N$. Thus, for
large block sizes, bad weight-3 data sequences have a small 
probability of being matched with bad weight-3 permuted 
data sequences, even in a two-code system. 

For a turbo code using q codes and q-1 random
interleavers, this probability is even smaller,
$1 - [1 - (6/N^2)^{q-1}]^N \approx (6/N)(6/N^2)^{q-2}$. This implies that the minimum distance
codeword of the turbo code in FIG. 2 is more likely to result
from a weight-2 data sequence of the form
(. . . 001001000 . . . ) than from the weight-3 sequence
(. . . 0011100 . . . ) that produces the minimum distance in the
unpermuted version of the same code. Higher weight 
sequences have an even smaller probability of reproducing 
themselves after being passed through a random interleaver. 

For a turbo code using q codes and q-1 interleavers, the
probability that a weight-n data sequence will be reproduced
somewhere within the block by all q-1 permutations is of
the form $1 - [1 - (\beta/N^{n-1})^{q-1}]^N$, where $\beta$ is a number that
depends on the weight-n data sequence but does not increase
with block size N. For large N, this probability is proportional
to $(1/N)^{nq-n-q}$, which falls off rapidly with N when n
and q are greater than two. Furthermore, the symmetry of
this expression indicates that increasing either the weight of 
the data sequence n or the number of codes q has roughly the 
same effect on lowering this probability. 
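
By way of illustration only, the probability expression above can be evaluated numerically and compared with the approximations quoted in the text (about $1 - e^{-2}$ for n = 2, q = 2; about 4/N for n = 2, q = 3; about 6/N for n = 3, q = 2). The function name and argument names below are hypothetical, and $\beta$ = 2 and $\beta$ = 6 are the values implied by the weight-2 and weight-3 examples discussed above.

```python
import math


def reproduction_probability(n_weight, q, block_size, beta):
    """Evaluate 1 - [1 - (beta / N^(n-1))^(q-1)]^N.

    expm1/log1p keep the result accurate when the per-position
    probability is very small.
    """
    per_position = (beta / block_size ** (n_weight - 1)) ** (q - 1)
    return -math.expm1(block_size * math.log1p(-per_position))


N = 4096
p_two_codes_w2 = reproduction_probability(2, 2, N, beta=2)    # ~ 1 - e^-2
p_three_codes_w2 = reproduction_probability(2, 3, N, beta=2)  # ~ 4/N
p_two_codes_w3 = reproduction_probability(3, 2, N, beta=6)    # ~ 6/N

print(p_two_codes_w2, p_three_codes_w2, p_two_codes_w3)
```

The symmetry noted in the text is visible here: raising the data weight from 2 to 3 and raising the number of codes from 2 to 3 both change the result from a constant to a quantity of order 1/N.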

In summary, from the above arguments, we conclude that 
weight-2 data sequences are an important factor in the 
design of the component codes, and that higher weight 
sequences have successively decreasing importance. Also, 
increasing the number of codes and, correspondingly, the 
number of interleavers, makes it more and more likely that 
bad input sequences will be broken up by one or more of the 
permutations. 

The minimum distance is not the most important characteristic
of the turbo code, except for its asymptotic
performance, at very high $E_b/N_0$. At moderate signal-to-noise
ratios (SNRs), the weight distribution for the first
several possible weights is necessary to compute the code
performance. Estimating the complete weight distribution of
these codes for large N and fixed interleavers is still an open
problem. However, it is possible to estimate the weight
distribution for large N for random interleavers by using
probabilistic arguments. For further considerations on the
weight distribution, see D. Divsalar and F. Pollara, “Turbo
Codes for Deep-Space Communications,” The Telecommunications
and Data Acquisition Progress Report 42-120,
October-December 1994, Jet Propulsion Laboratory,
Pasadena, Calif., pp. 29-39, Feb. 15, 1995 (hereby incorporated
by reference).

Interleaver Design

In view of the above discussion, it should be clear that
interleavers should be capable of spreading low-weight
input sequences so that the resulting codeword has high
weight. Block interleavers, defined by a matrix with $\nu_r$ rows
and $\nu_c$ columns such that $N = \nu_r \times \nu_c$, may fail to spread certain
sequences. For example, the weight-4 sequence shown in
FIG. 3 cannot be broken by a block interleaver. In order to
break such sequences, random interleavers are desirable, as
discussed above. A method for the design of non-random
interleavers is discussed in P. Robertson, “Illuminating the
Structure of Code and Decoder of Parallel Concatenated
Recursive Systematic (Turbo) Codes,” Proceedings
GLOBECOM’94, San Francisco, Calif., pp. 1298-1303,
December 1994 (hereby incorporated by reference).

Block interleavers are effective if the low-weight
sequence is confined to a row. If low-weight sequences
(which can be regarded as the combination of lower-weight
sequences) are confined to several consecutive rows, then
the $\nu_c$ columns of the interleaver should be sent in a specified
order to spread the low-weight sequence as much as possible.
A method for reordering the columns is given in
E. Dunscombe and F. C. Piper, “Optimal interleaving
scheme for convolutional codes”, Electronic Letters, Oct.
26, 1989, Vol. 25, No. 22, pp. 1517-1518 (hereby incorporated
by reference). This method guarantees that for any
number of columns $\nu_c = aq + r$ $(r \le a - 1)$, the minimum separation
between data entries is q-1, where a is the number of
columns affected by a burst. However, as can be observed in
the example in FIG. 3, the sequence “1001” will still appear
at the input of the encoders for any possible column permutation.
Only if we permute the rows of the interleaver in
addition to its columns is it possible to break the low-weight
sequences. The method in Bahl et al. can be used again for
the permutation of rows. Appropriate selection of a and q for
rows and columns depends on the particular set of codes
used and on the specific low-weight sequences that are to be
broken.
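
By way of illustration only, the following sketch shows why a column permutation alone cannot break such a pattern while a row permutation can. FIG. 3 is not reproduced in this section, so the rectangular weight-4 pattern used here (two rows each containing “1001” in the same columns) is an assumed stand-in for it. The 8 x 8 size, the particular permutations, and the function names are hypothetical.

```python
def block_interleave(bits, rows, cols, row_perm=None, col_perm=None):
    """Write row by row, then read column by column.

    row_perm and col_perm give the order in which rows and columns are
    visited during read-out (identity when omitted).
    """
    row_perm = list(range(rows)) if row_perm is None else row_perm
    col_perm = list(range(cols)) if col_perm is None else col_perm
    matrix = [bits[r * cols:(r + 1) * cols] for r in range(rows)]
    return [matrix[r][c] for c in col_perm for r in row_perm]


def as_string(bits):
    return "".join(str(b) for b in bits)


ROWS, COLS = 8, 8
data = [0] * (ROWS * COLS)
for r, c in [(2, 1), (2, 4), (5, 1), (5, 4)]:   # assumed weight-4 rectangle
    data[r * COLS + c] = 1

plain = as_string(block_interleave(data, ROWS, COLS))
cols_permuted = as_string(
    block_interleave(data, ROWS, COLS, col_perm=[3, 1, 7, 4, 0, 2, 6, 5]))
rows_permuted = as_string(
    block_interleave(data, ROWS, COLS, row_perm=[2, 0, 1, 3, 4, 6, 7, 5]))

print("1001" in as_string(data))   # pattern seen by the first encoder
print("1001" in plain)             # still present after block interleaving
print("1001" in cols_permuted)     # still present after column permutation
print("1001" in rows_permuted)     # broken once the rows are permuted
```

The two 1's that share a column stay three positions apart in the column read-out no matter how the columns are ordered, which is why only a row permutation changes their separation.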

We have also designed semi-random permuters
(interleavers) by generating random integers i, $1 \le i \le N$,
without replacement. We define an “S-random” permutation
as follows: each randomly selected integer is compared to the S
previously selected integers. If the current selection is within
a distance of ±S of any of the S previous selections, then the
current selection is rejected. This process is repeated until all
N integers are selected. While the searching time increases
with S, we observed that choosing $S < (N/2)^{0.5}$ usually produces
a solution in reasonable time. (S=1 results in a purely
random interleaver.) In simulations, we used S=11 for
N=256, and S=31 for N=4096.
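
The S-random construction described above can be sketched as follows. This is a minimal illustration under stated assumptions rather than the procedure actually used in the simulations: it scans a shuffled list of remaining integers for the first acceptable candidate instead of drawing and rejecting one integer at a time, it restarts from scratch when it reaches a dead end, and it treats "within a distance of ±S" as a difference strictly smaller than S so that S=1 reduces to a purely random permutation. The function name, the `seed` argument, and the `max_restarts` argument are hypothetical.

```python
import random


def s_random_permutation(n, s, seed=None, max_restarts=1000):
    """Return a permutation of range(n) in which each entry differs by at
    least s from each of the s entries selected immediately before it."""
    rng = random.Random(seed)
    for _ in range(max_restarts):
        remaining = list(range(n))
        rng.shuffle(remaining)
        perm = []
        while remaining:
            recent = perm[-s:] if s > 0 else []
            for idx, candidate in enumerate(remaining):
                if all(abs(candidate - prev) >= s for prev in recent):
                    perm.append(remaining.pop(idx))
                    break
            else:
                break                  # dead end: no candidate fits, restart
        if len(perm) == n:
            return perm
    raise RuntimeError("no S-random permutation found; try a smaller s")


perm = s_random_permutation(64, 4, seed=1)
print(perm[:16])
```

Larger values of S lengthen the search because dead ends near the end of the construction become more frequent, which is consistent with the observation that S should stay below roughly $(N/2)^{0.5}$.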

The advantage of using three or more constituent codes is
that the corresponding two or more interleavers have a better
chance to break sequences that were not taken care of by
another interleaver. The disadvantage is that, for an overall
desired code rate, each code must be punctured more,
resulting in weaker constituent codes. In our experiments,
we have used randomly selected interleavers and interleavers
based on the row-column permutation described above.
In general, random interleavers and S-random interleavers
are good for low SNR operation (e.g., PCS applications
requiring $P_b = 10^{-3}$), where the overall weight distribution of
the code is more important than the minimum distance.

Terminated Parallel Convolutional Codes as Block Codes

Consider the combination of permuter and encoder as a
linear block code. Define $P_i$ as the parity matrix of the
terminated convolutional code i. Then the overall generator
matrix for three parallel codes is $G = [I \;\; P_1 \;\; \pi_2 P_2 \;\; \pi_3 P_3]$, where
$\pi_i$ are the permuters (interleavers). In order to maximize the
minimum distance of the code given by G, we should
maximize the number of linearly independent columns of
the corresponding parity check matrix H. This suggests that
the design of $P_i$ (code) and $\pi_i$ (permutation) are closely
related, and it does not necessarily follow that optimum
component codes (maximum $d_{min}$) yield optimum parallel
concatenated codes. For very small N, we used this concept
to design jointly the permuter and the component convolutional
codes.

Termination

The encoder of FIG. 2 was used to generate an (n(N+M), N)
block code, where the M tail bits of code 2 and code 3
are not transmitted. Since the component encoders are
recursive, it is not sufficient to set the last M information bits
to zero in order to drive the encoder to the all-zero state, i.e.,
to terminate the trellis. The termination (tail) sequence
depends on the state of each component encoder after N bits,
which makes it impossible to terminate all component
encoders with M predetermined tail bits.

FIG. 4 is a block diagram of a general single input coder
(the code is not important). However, the inventive termination
technique can be applied to b-input coders, where
$b \ge 1$. Trellis termination is performed by setting the
switches shown in FIG. 4 to position B to permit selective
feedback as shown from the taps between delay elements D.
The tap coefficients $a_{i,0}, \ldots, a_{i,m-1}$ for i = 1, 2, . . . , b can be
obtained by repeated use of the following equation, and by
solving the resulting equations:

$$S^k(D) = \left[ \sum_{i=1}^{b} u_i^k h_i(D) + D\,S^{k-1}(D) \right] \bmod h_0(D) \tag{13}$$

where $S^k(D)$ is the state of the encoder at time k with
coefficients $S_0^k, S_1^k, \ldots, S_{m-1}^k$ for input $u_1^k, \ldots, u_b^k$. The
trellis can be terminated in state zero with at least m/b and
at most m clock cycles. When multiple input bits are used
(parallel feedback shift registers), a switch should be used
for each input bit.
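
By way of illustration only, the effect of the switch can be sketched for the single-input case (b = 1). The sketch assumes a shift register whose feedback bit is the modulo-2 sum of selected delay-element outputs. In the assumed position B the encoder input is replaced by that same feedback bit, so a zero enters the register at every clock and the state reaches zero after m clocks. The register layout, the tap list `fb_taps`, and the function names are hypothetical and are not read from FIG. 4.

```python
def feedback_bit(state, fb_taps):
    """Modulo-2 sum of the tapped delay-element outputs."""
    bit = 0
    for s, t in zip(state, fb_taps):
        bit ^= s & t
    return bit


def clock(state, u, fb_taps):
    """Advance the recursive shift register by one input bit u."""
    a = u ^ feedback_bit(state, fb_taps)
    return [a] + state[:-1]


def terminate(state, fb_taps):
    """Assumed position B: feed the feedback bit back as the input."""
    tail = []
    for _ in range(len(state)):
        u = feedback_bit(state, fb_taps)
        tail.append(u)
        state = clock(state, u, fb_taps)
    return tail, state


TAPS = [1, 1]            # assumed feedback polynomial 1 + D + D^2 (m = 2)
start = [1, 0]           # some nonzero state after the data block

after_zero_tail = clock(clock(start, 0, TAPS), 0, TAPS)
tail_bits, final_state = terminate(start, TAPS)

print(after_zero_tail)   # zero tail bits leave the recursive encoder nonzero
print(tail_bits, final_state)
```

The tail bits depend on the state reached after the data block, which is why a fixed, predetermined tail cannot terminate several differently permuted encoders at once.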

New Structural Designs

The following further describes several novel structures
for turbo code encoders that apply the principles set forth
above.

FIG. 5 is a block diagram that shows a turbo code encoder
having at least two coding modules $C_1$, $C_n$ and at least one
interleaver $\pi_n$ for each of the second and subsequent coding
modules. Additional coding modules with corresponding
interleavers may be added as desired. Notably, this structure
outputs only encoded parity elements $X_n$ from the coding
modules C; the original data source elements d are not
transmitted or stored. The decoder structure described below
is capable of reconstituting d only from the received elements
$Y_n$ corresponding to the encoded elements $X_n$. The
structure shown in FIG. 5 has good performance (i.e., low
BER for a given SNR), is less complex than the prior art, and
permits a simple decoder to be used for $C_1$ (see, for example,
FIG. 10).

FIG. 6A is a block diagram that shows a turbo code
encoder having at least two input data lines $d_1$, $d_m$ coupled
as shown to at least two coding modules $C_1$, $C_n$. In addition,
each data line is coupled through a corresponding interleaver
$\pi_{n,m}$ to each of the second and subsequent coding modules.
The codes for each coding module may differ, and the 
number of outputs from each coder may differ. Further, not 
all data lines need be applied to all coding modules. 
Examples of multiple input encoders are shown in FIGS. 9 
and 21. Again, additional data lines and coding modules 
with corresponding interleavers may be added as desired. 
Also shown in dotted outline is an optional multilevel 
trellis-coded modulator M described below. The structure 
shown in FIG. 6 is particularly useful for generating output 
to a binary modulator (such as shown in FIG. 9) or a 
multilevel modulator (such as a trellis code modulator as 
shown in FIG. 21). 

FIG. 6B is a block diagram showing a variation of FIG. 
6A, in which a turbo code encoder has at least one input data 
line d coupled as shown to only one coding module C, 
directly and through at least one interleaver (two, $\pi_1$ and $\pi_2$,
are shown by way of illustration). This structure is a
self-concatenated coder. Outputs $u_1$ and $u_2$ contribute to the
encoding function but need not be transmitted. A decoder for 
this encoder is shown in FIG. 20D, as described below. 

FIG. 6B2 is a block diagram showing a variation of a 
self-concatenated code, where the encoder has at least one 
input data line d, and d is sent to the modulator. Each 
incoming bit of d is repeated m times, multiplexed and 
interleaved to generate the data line u which enters the 
systematic recursive convolutional code C. Systematic bits 
at the output of C are not transmitted. Only the parity bits or 
a punctured version of them are used at the output of C. 

FIG. 6C is a block diagram that shows a turbo code
encoder having at least two input data lines $d_1$, $d_m$ coupled
as shown to at least two coding modules $C_1$, $C_n$. In addition,
each data line is coupled through a plurality of corresponding
interleavers $\pi_{n10}$, $\pi_{nm1}$ to each of the second and
subsequent coding modules. The number of interleavers per 
coding module need not be the same, and the codes for each 
coding module may differ. Again, additional data lines and 
coding modules with corresponding sets of interleavers may 
be added as desired. Also shown in dotted outline is an 
optional multilevel trellis-coded modulator M described 
below. The structure shown in FIG. 6C is particularly useful 
for generating output to a binary modulator (such as shown 
in FIG. 9) or a multilevel modulator (such as a trellis code 
modulator as shown in FIG. 21). It also generates a more 
random encoding of input data than the structure shown in 
FIG. 6A, thus generally providing good performance (i.e., a 
lower bit error rate for a particular signal to noise ratio). 

FIG. 7A is a block diagram of a serial encoder in
accordance with the present invention. At least one data
stream d is passed through at least one “pre” coder $C_0$ to
generate a stream of coded bits (coded bits u and p are shown
by way of illustration), which are applied to at least one
permuter $\pi$, the output of which is applied to at least one
“post” coder $C_1$. If desired, the permuter $\pi$ can be constructed
from multiple parallel permuters. Preliminary data
indicates good performance for this structure. If the encoder
$C_1$ is a systematic recursive convolutional code, the “pre”
coder $C_0$ can be non-systematic or systematic, but not
necessarily recursive. Both $C_0$ and $C_1$ can be punctured to
adjust the overall code rate. A decoder for this encoder is
shown in FIG. 20E, as described below.

FIG. 7B is a block diagram of a parallel-serial encoder in
accordance with the present invention. At least one data
stream d is passed through a first coder $C_0$ to generate at least
one stream of coded bits (coded bits u and p are shown by
way of illustration; note that u and p are preferably
identical streams, u=p in FIG. 7B), which are applied to
at least two respective permuters $\pi_1$ and $\pi_2$. The output of
the permuters is applied to at least two separate coders $C_1$,
$C_2$, as shown. Preliminary data indicates good performance
for this structure if u and p are identical coded streams. A
decoder for this encoder is shown in FIG. 20F, as described
below.

FIG. 7C is a block diagram of a serial-parallel hybrid
encoder in accordance with the present invention. At least
one data stream d is passed through a first coder $C_0$ to
generate at least one stream p of coded bits, which are
applied to permuter $\pi_2$. The output of permuter $\pi_2$ (denoted
as $u_2$) is applied to coder $C_2$, producing coded output $q_2$. The
input data stream d is also permuted by permuter $\pi_1$,
producing the output denoted as $u_1$. Signal $u_1$ is applied to
coder $C_1$, producing coded output $q_1$.
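
By way of illustration only, the data flow of the serial-parallel hybrid structure can be sketched with placeholder component coders. Everything specific in the sketch is an assumption made for the example: the repetition coder standing in for the first coder, the accumulators standing in for the other two coders, the permutation tables, the convention that output position k of a permuter takes input position `pi[k]`, and the function names. The sketch does not state which of the signals are transmitted.

```python
def repeat2(bits):
    """Toy first coder: rate-1/2 repetition (placeholder only)."""
    return [b for bit in bits for b in (bit, bit)]


def accumulate(bits):
    """Toy recursive coder 1/(1+D): running modulo-2 sum (placeholder only)."""
    out, acc = [], 0
    for bit in bits:
        acc ^= bit
        out.append(acc)
    return out


def permute(bits, pi):
    """Assumed convention: output position k takes input position pi[k]."""
    return [bits[i] for i in pi]


def hybrid_encode(d, c0, c1, c2, pi1, pi2):
    p = c0(d)                  # serial branch: first coder output
    u2 = permute(p, pi2)
    q2 = c2(u2)
    u1 = permute(d, pi1)       # parallel branch: permuted data
    q1 = c1(u1)
    return {"p": p, "u1": u1, "u2": u2, "q1": q1, "q2": q2}


signals = hybrid_encode(
    d=[1, 0, 1],
    c0=repeat2, c1=accumulate, c2=accumulate,
    pi1=[2, 0, 1],
    pi2=[0, 2, 4, 1, 3, 5],
)
print(signals)
```

Because the second permuter acts on the coded stream p, its length must match the output length of the first coder, whereas the first permuter acts on the data block itself.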

It should be noted that the structures shown in FIGS. 5-7
are general in nature, and provide advantages independent of
specific interleavers and coders. Additional advantages are
provided if those coders connected to modulation or channels
and preceded by an interleaver produce high output
weight for input weight one. This can be achieved, for
example, with recursive convolutional codes.

Performance and Simulation Results

The following sets forth results from applying the principles
set forth above.

A. Performance of Various Rate Codes
FIG. 8 shows the performance of turbo codes with m
iterations and an interleaver size of N=16,384. The following
codes are used as examples:

(1) Rate 1/2 Turbo Codes.

Code A: Two 16-state, rate 2/3 constituent codes are used
to construct a rate 1/2 turbo code as shown in FIG. 9. The
(worst-case) minimum codeword weights, $d_i$, corresponding
to a weight-i input sequence for this code, are
$d_{ef}$=14, $d_3$=7, $d_4$=8, $d_5$=5=$d_{min}$, and $d_6$=6.

Code B: A rate 1/2 turbo code also was constructed by
using a differential encoder and a 32-state, rate 1/2 code,
as shown in FIG. 10. This is an example where the
systematic (information) bits applied to both encoders
are not transmitted. The (worst-case) minimum codeword
weights, $d_i$, corresponding to a weight-i input
sequence for this code, are $d_{ef}$=19, $d_4$=6=$d_{min}$, $d_6$=9,
$d_8$=8, and $d_{10}$=11. The output weights for odd i are
large.

(2) Rate 1/3 Turbo Code.

Code C: Two 16-state, rate 1/2 constituent codes are used
to construct a rate 1/3 turbo code as shown in FIG. 11.
The (worst-case) minimum codeword weights, $d_i$, corresponding
to a weight-i input sequence for this code,
are $d_{ef}$=22, $d_3$=11, $d_4$=12, $d_5$=9=$d_{min}$, $d_6$=14, and
$d_7$=15.

(3) Rate 1/4 Turbo Code.

Code D: Two 16-state, rate 1/2 and rate 1/3 constituent codes
are used to construct a rate 1/4 turbo code, as shown in
FIG. 12, with $d_{ef}$=32, $d_3$=15=$d_{min}$, $d_4$=16, $d_5$=17,
$d_6$=16, and $d_7$=19.

(4) Rate 1/15 Turbo Code.

Code E: Two 16-state, rate 1/8 constituent codes are used
to construct a rate 1/15 turbo code,
$(1, g_1/g_0, g_2/g_0, g_3/g_0, g_4/g_0, g_5/g_0, g_6/g_0, g_7/g_0)$ and
$(g_1/g_0, g_2/g_0, g_3/g_0, g_4/g_0, g_5/g_0, g_6/g_0, g_7/g_0)$,
with $g_0 = (23)_{\text{octal}}$, $g_1 = (21)_{\text{octal}}$, $g_2 = (25)_{\text{octal}}$,
$g_3 = (27)_{\text{octal}}$, $g_4 = (31)_{\text{octal}}$, $g_5 = (33)_{\text{octal}}$,
$g_6 = (35)_{\text{octal}}$, and $g_7 = (37)_{\text{octal}}$. The (worst-case) minimum
codeword weights, $d_i$, corresponding to a weight-i input
sequence for this code are $d_{ef}$=142, $d_3$=39=$d_{min}$, $d_4$=48,
$d_5$=45, $d_6$=50, and $d_7$=63.

B. Performance of Two Codes

The performance obtained by turbo decoding the code
with two constituent codes $(1, g_b/g_a)$, where $g_a = (37)_{\text{octal}}$ and
$g_b = (21)_{\text{octal}}$, and with random permutations of lengths
N=4096 and N=16,384 is compared in FIG. 13 to the
capacity of a binary-input Gaussian channel for rate r=1/2.
The best performance curve is approximately 0.7 dB from
the Shannon limit at BER=$10^{-4}$.

C. Unequal Rate Encoders 

We now extend the results to encoders with unequal rates
with two K=5 constituent codes $(1, g_b/g_a, g_c/g_a)$ and $(g_b/g_a)$,
where $g_a = (37)_{\text{octal}}$, $g_b = (33)_{\text{octal}}$, and $g_c = (25)_{\text{octal}}$. This structure
improves the performance of the overall rate 1/4 code,
as shown in FIG. 13. This improvement is due to the fact that
we can avoid using the interleaved information data at the 
second encoder and that the rate of the first code is lower 
than that of the second code. For PCS applications, for 
example, short interleavers should be used, since the 
vocoder frame is usually 20 ms. Therefore we selected 192 
bit and 256 bit interleavers as an example, corresponding to 
9.6 and 13 Kbps. (Note that this small difference of inter- 
leaver size does not affect significantly the performance). 
The performance of codes with short interleavers is shown 
in FIG. 14 for the K=5 codes described above for random 
permutation and row-column permutation with a=2 for rows 
and a=4 for columns. 

D. Performance of Three Codes 

The performance of two different three-code turbo codes 
with random interleavers is shown in FIG. 15 for N=4096. 
The first code uses three recursive codes shown in FIG. 2 
with constraint length K=3. The second code uses three 
recursive codes with K=4, $g_a = (13)_{\text{octal}}$, and $g_b = (11)_{\text{octal}}$.
Note that the nonsystematic version of the second encoder is 
catastrophic, but the recursive systematic version is non- 
catastrophic. We found that this K=4 code has better per- 
formance than several others. 

The performance of the K=4 code was improved by going
from 20 to 30 iterations. We found that the performance
could also be improved by using an S-random interleaver
with S=31. For shorter blocks (192 and 256 bits), the results
are shown in FIG. 14, where it can be observed that
approximately 1 dB SNR is required for BER=$10^{-3}$, which
implies, for example, a CDMA capacity C=0.8η. We have
noticed that the slope of the BER curve changes around
BER=$10^{-5}$ (flattening effect) if the interleaver is not
designed properly to maximize $d_{min}$ or is chosen at random.

Turbo Code Decoders

The turbo decoding configuration proposed by Berrou for 
two codes is shown schematically in FIG. 16. This configu- 
ration operates in serial mode, i.e., decoder DEC1 processes 
data before decoder DEC2 starts operation, and so on. 
However, we show below an improved decoder configura- 
tion and its associated decoding rule based upon a parallel 
structure for three or more codes. 

FIG. 17 is a block diagram of a parallel decoder structure 
in accordance with the present invention. Decoder DEC1 
processes data in parallel with decoders DEC2 and DEC3, 
and each passes output to the other decoders at each of a 
plurality of stages, as shown. Self loops are not allowed in 
these structures since they cause degradation or divergence 
in the decoding process (positive feedback).

We have determined that the parallel structure shown in
FIG. 17 has better performance than the prior art series
decoder. To demonstrate this, let $u_k$ be a binary random
variable taking values in {0,1}, representing the sequence of
information bits $u = (u_1, \ldots, u_N)$. The MAP (maximum a
posteriori) probability algorithm described by Bahl et al.,
supra, provides the log likelihood ratio $L_k$, given the
received symbols y:

$$L_k = \log \frac{P(u_k = 1 \mid y)}{P(u_k = 0 \mid y)} \tag{14}$$

$$L_k = \log \frac{\sum_{u: u_k = 1} P(y \mid u) \prod_{j \ne k} P(u_j)}{\sum_{u: u_k = 0} P(y \mid u) \prod_{j \ne k} P(u_j)} + \log \frac{P(u_k = 1)}{P(u_k = 0)} \tag{15}$$

For efficient computation of Eq. (15) when the a priori
probabilities $P(u_j)$ are nonuniform, the modified MAP algorithm
in J. Hagenauer and P. Robertson, “Iterative (Turbo)
Decoding of Systematic Convolutional Codes With the
MAP and SOVA Algorithms,” Proc. of the ITG Conference
on Source and Channel Coding (Frankfurt, Germany, October
1994) is simpler to use. Therefore, we use the modified
MAP algorithm.

If the rate b/n constituent code is not equivalent to a
punctured rate 1/n’ code or if turbo trellis coded modulation
is used, we can first use the symbol MAP algorithm to
compute the log-likelihood ratio of a symbol $u = u_1, u_2, \ldots, u_b$
given the observation y as:

$$\lambda(u) = \log \frac{P(u \mid y)}{P(0 \mid y)} \tag{16}$$

where 0 corresponds to the all-zero symbol. Then we obtain
the log-likelihood ratios of the jth bit within the symbol by:

$$L(u_j) = \log \frac{\sum_{u: u_j = 1} e^{\lambda(u)}}{\sum_{u: u_j = 0} e^{\lambda(u)}} \tag{17}$$

In this way, the turbo decoder operates on bits and bit
interleaving, rather than symbol interleaving, is used.
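
By way of illustration only, the conversion from symbol log-likelihood ratios to bit log-likelihood ratios can be sketched as a direct sum over all symbols, which is practical only for small b. The dictionary representation of the symbol ratios and the function name are hypothetical. The example symbol probabilities are made up so that the two bits are independent, which makes the expected bit ratios easy to verify by hand.

```python
import math
from itertools import product


def bit_llrs_from_symbol_llrs(lam, b):
    """lam maps each b-bit tuple u to log P(u|y)/P(0|y); returns L(u_j), j=0..b-1."""
    symbols = list(product((0, 1), repeat=b))
    llrs = []
    for j in range(b):
        num = sum(math.exp(lam[u]) for u in symbols if u[j] == 1)
        den = sum(math.exp(lam[u]) for u in symbols if u[j] == 0)
        llrs.append(math.log(num / den))
    return llrs


# Made-up posterior: independent bits with P(u_1 = 1) = 0.8 and P(u_2 = 1) = 0.3.
p1, p2 = 0.8, 0.3
posterior = {
    (a, c): (p1 if a else 1 - p1) * (p2 if c else 1 - p2)
    for a in (0, 1) for c in (0, 1)
}
lam = {u: math.log(p / posterior[(0, 0)]) for u, p in posterior.items()}

bit_llrs = bit_llrs_from_symbol_llrs(lam, b=2)
print(bit_llrs)   # compare with log(0.8/0.2) and log(0.3/0.7)
```

The normalizing term for the all-zero symbol cancels between numerator and denominator, so the bit ratios depend only on the relative symbol probabilities.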

FIG. 18 is a block diagram of a channel model, where the
noise terms n are independent identically distributed (i.i.d.)
zero-mean Gaussian random variables with unit variance,
$\rho = (2rE_b/N_0)^{0.5}$ is the signal to noise ratio, and r is the code
rate. The same model is used for each encoder. To explain
the basic decoding concept, we restrict ourselves to three
codes, but extension to several codes is straightforward. In
order to simplify the notation, consider the combination of
permuter and encoder as a block code with input u and
outputs $x_i$, i=0, 1, 2, 3 ($x_0$=u), and the corresponding
received sequences $y_i$, i=0, 1, 2, 3. The optimum bit decision
metric on each bit is (for data with uniform a priori
probabilities):

$$L_k = \log \frac{\sum_{u: u_k = 1} P(y_0 \mid u) P(y_1 \mid u) P(y_2 \mid u) P(y_3 \mid u)}{\sum_{u: u_k = 0} P(y_0 \mid u) P(y_1 \mid u) P(y_2 \mid u) P(y_3 \mid u)} \tag{18}$$

In practice, we cannot compute Eq. (18) for large N
because the permutations $\pi_2$, $\pi_3$ imply that $y_2$ and $y_3$ are no
longer simple convolutional encodings of u. Suppose that
we evaluate $P(y_i \mid u)$, i=2, 3 in Eq. (18) using Bayes’ rule and
using the following approximation:

$$P(u \mid y_i) \approx \prod_{k=1}^{N} \tilde{P}_i(u_k) \tag{19}$$


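Note that $P(u \mid y_i)$ is not separable in general. However, for
i=0, $P(u \mid y_0)$ is separable; hence, Eq. (19) holds with equality.
If such an approximation can be obtained, we can use it in
Eq. (18) for i=2 and i=3 (by Bayes’ rule) to complete the
algorithm. A reasonable criterion for this approximation is to
choose the right-hand term of Eq. (19) such that it minimizes
the Kullback distance or free energy. Define $\tilde{L}_{ik}$ by:

$$\tilde{P}_i(u_k) = \frac{e^{u_k \tilde{L}_{ik}}}{1 + e^{\tilde{L}_{ik}}} \tag{20}$$

where $u_k \in \{0,1\}$. Then the Kullback distance is given by:

$$F(\tilde{L}_i) = \sum_{u} \frac{e^{\sum_{k=1}^{N} u_k \tilde{L}_{ik}}}{\prod_{k=1}^{N} \left(1 + e^{\tilde{L}_{ik}}\right)} \log \frac{e^{\sum_{k=1}^{N} u_k \tilde{L}_{ik}}}{\prod_{k=1}^{N} \left(1 + e^{\tilde{L}_{ik}}\right) P(u \mid y_i)} \tag{21}$$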



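Minimizing Eq. (21) involves forward and backward recursions
analogous to the MAP decoding algorithm. Instead of
using Eq. (21) to obtain $\{\tilde{P}_i\}$, or equivalently $\{\tilde{L}_{ik}\}$, we use
Eqs. (19) and (20) for i=0, 2, 3 (by Bayes’ rule) to express Eq.
(18) as:

$$L_k = f(y_1, \tilde{L}_0, \tilde{L}_2, \tilde{L}_3, k) + \tilde{L}_{0k} + \tilde{L}_{2k} + \tilde{L}_{3k} \tag{22}$$

where $\tilde{L}_{0k} = 2\rho y_{0k}$ (for binary modulation) and:

$$f(y_1, \tilde{L}_0, \tilde{L}_2, \tilde{L}_3, k) = \log \frac{\sum_{u: u_k = 1} P(y_1 \mid u) \prod_{j \ne k} e^{u_j (\tilde{L}_{0j} + \tilde{L}_{2j} + \tilde{L}_{3j})}}{\sum_{u: u_k = 0} P(y_1 \mid u) \prod_{j \ne k} e^{u_j (\tilde{L}_{0j} + \tilde{L}_{2j} + \tilde{L}_{3j})}} \tag{23}$$

We can use Eqs. (19) and (20) again, but this time for i=0, 1,
3 to express Eq. (18) as:

$$L_k = f(y_2, \tilde{L}_0, \tilde{L}_1, \tilde{L}_3, k) + \tilde{L}_{0k} + \tilde{L}_{1k} + \tilde{L}_{3k} \tag{24}$$

and similarly,

$$L_k = f(y_3, \tilde{L}_0, \tilde{L}_1, \tilde{L}_2, k) + \tilde{L}_{0k} + \tilde{L}_{1k} + \tilde{L}_{2k} \tag{25}$$

A solution to Eqs. (22), (24), and (25) is:

$$\tilde{L}_{1k} = f(y_1, \tilde{L}_0, \tilde{L}_2, \tilde{L}_3, k) \tag{26}$$

$$\tilde{L}_{2k} = f(y_2, \tilde{L}_0, \tilde{L}_1, \tilde{L}_3, k) \tag{27}$$

$$\tilde{L}_{3k} = f(y_3, \tilde{L}_0, \tilde{L}_1, \tilde{L}_2, k) \tag{28}$$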


for k=1, 2, . . . , N (provided that a solution to Eqs. (26)-(28)
does indeed exist). The final decision is then based on:

$$L_k = \tilde{L}_{0k} + \tilde{L}_{1k} + \tilde{L}_{2k} + \tilde{L}_{3k} \tag{29}$$

which is passed through a hard limiter with zero threshold.

We attempted to solve the nonlinear equations in Eqs. (26)-(28) for
$\tilde{L}_1$, $\tilde{L}_2$, and $\tilde{L}_3$ by using the iterative procedure:

$$\tilde{L}_{1k}^{(m+1)} = \alpha_1^{(m)} f(y_1, \tilde{L}_0, \tilde{L}_2^{(m)}, \tilde{L}_3^{(m)}, k) \tag{30}$$

for k=1, 2, . . . , N, iterating on m. Similar recursions hold
for $\tilde{L}_{2k}^{(m)}$ and $\tilde{L}_{3k}^{(m)}$. The gain $\alpha_1^{(m)}$ should be equal to one,
but we noticed experimentally that better convergence can
be obtained by optimizing this gain for each iteration,
starting from a value slightly less than one and increasing
toward one with the iterations, as is often done in simulated
annealing methods.

We start the recursion with the initial condition $\tilde{L}_1^{(0)}$ =
$\tilde{L}_2^{(0)}$ = $\tilde{L}_3^{(0)}$ = $\tilde{L}_0$ (note that the components of the $\tilde{L}_i$ corresponding
to the tail bits are set to zero for all iterations). For
the computation of f(·), we preferably use the modified
MAP algorithm as described in “Turbo Codes for Deep-Space
Communications”, supra, with permuters (direct and
inverse) where needed. Call this basic decoder $D_i$, i=1, 2, 3.
The $\tilde{L}_i$, i=1, 2, 3, represent the extrinsic information. The
signal flow graph for extrinsic information is shown in FIG.
19, which is a fully connected graph without self-loops.
Parallel, serial, or hybrid implementations can be realized
based on the signal flow graph of FIG. 19 (in this figure, $y_0$
is considered part of $y_2$). Based on the equations above, each
node’s output is equal to internally generated reliability L
minus the sum of all inputs to that node.
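
By way of illustration only, the parallel update schedule described above can be sketched with the component decoders treated as black boxes. The sketch assumes that each component decoder is supplied as a function that receives, for every bit, the sum of the systematic term and the extrinsic terms of the other decoders, and returns its own extrinsic values. The function names and the `gain` argument are hypothetical. The toy component "decoders" in the usage example ignore their input and return fixed values, which corresponds to memoryless codes and is used only to make the result easy to check.

```python
def parallel_turbo_decode(decoders, l0, n_iter, gain=lambda m: 1.0):
    """Run n_iter parallel iterations and return (total reliabilities, bits).

    decoders[i](apriori) -> extrinsic values of decoder i, where
    apriori[k] = l0[k] + sum of the other decoders' extrinsic values.
    """
    n = len(l0)
    ext = [list(l0) for _ in decoders]          # initial condition: L_i^(0) = L_0
    for m in range(n_iter):
        alpha = gain(m)
        new_ext = []
        for i, dec in enumerate(decoders):
            apriori = [
                l0[k] + sum(ext[j][k] for j in range(len(decoders)) if j != i)
                for k in range(n)
            ]
            new_ext.append([alpha * v for v in dec(apriori)])
        ext = new_ext                           # all decoders update simultaneously
    total = [l0[k] + sum(e[k] for e in ext) for k in range(n)]
    bits = [1 if v > 0 else 0 for v in total]   # hard limiter, zero threshold
    return total, bits


# Toy usage: three memoryless "decoders" with fixed extrinsic outputs.
fixed = [[1.0, -2.0], [0.5, -0.5], [-0.2, 1.0]]
toy_decoders = [lambda apriori, c=c: list(c) for c in fixed]
total, bits = parallel_turbo_decode(toy_decoders, l0=[0.3, -0.1], n_iter=3)
print(total, bits)
```

No decoder receives its own previous output as input, which mirrors the absence of self-loops in the signal flow graph. A gain schedule that starts slightly below one can be supplied through `gain`.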

FIG. 20A is a block diagram of a single parallel block
decoder in accordance with the present invention. Inputs
include feedback terms $\tilde{L}_1^{(m)}$ and $\tilde{L}_3^{(m)}$, and input terms $\tilde{L}_0$
and $y_2$, as described above. Direct permuter (interleaver) $\pi_2$
is coupled to a MAP function block as shown, which in turn
is coupled to the corresponding inverse permuter $\pi_2^{-1}$.

In all instances, the MAP algorithm always starts and ends
at the all-zero state since we always terminate the trellis as
described above or in “Turbo Codes for Deep-Space
Communications”, supra. Similar structures apply for block
decoder 1 (we assumed $\pi_1$ = identity; however, any $\pi_1$ can
be used) and block decoder 3 in a three code system. The
overall decoder is composed of block decoders connected as
in FIG. 17, which can be implemented as a pipeline or by
feedback.

An alternative design to that shown in FIG. 20A, which is
more appropriate for use in turbo trellis coded modulation or
when the systematic bits are not transmitted, sets $\tilde{L}_0$=0 and
considers $y_0$ as part of $y_1$ (that is, no direct use is made of
the received term corresponding to the original signal data
d). Even in the presence of systematic bits, if desired, one
can set $\tilde{L}_0$=0 and consider $y_0$ as part of $y_1$. If the systematic
bits are distributed among encoders, we use the same
distribution of $y_0$ among the MAP decoders.

FIG. 20B is a block diagram showing a multiple turbo
code decoder for a three code system, using three blocks
similar to the decoder in FIG. 20A. In this embodiment, the
parity output $x_{1p}$, $x_{2p}$, $x_{3p}$ of the encoder shown in FIG. 2,
received as $y_1$, $y_2$, $y_3$, can be used to reconstruct d. This
decoder can also be configured to a two code system and for
more than three codes.

FIG. 20C is a block diagram showing a multiple turbo
code decoder for a three code system, using three blocks
similar to the decoder in FIG. 20A. This embodiment shows
a serial implementation when the switches are in position S
and the delay elements are present. This decoder can also be
configured to a two code system and for more than three
codes.

At this point, further approximation for turbo decoding is
possible if one term corresponding to a sequence u dominates
other terms in the summation in the numerator and
denominator of Eq. (10). Then the summations in that
equation can be replaced by “maximum” operations with the
same indices, i.e., replacing Σ_{u:u_k=i} with max_{u:u_k=i} for
i = 0, 1. A similar approximation can be used for L_2k and L_3k
in Eqs. (13)-(15). This suboptimum decoder then corresponds to a
turbo decoder that uses soft output Viterbi (SOVA)-type
decoders rather than MAP decoders. Accordingly, FIG. 20B
indicates that the decoders may be MAP or SOVA decoders.
Further approximations, i.e., replacing Σ with max, can also
be used in the MAP algorithm.
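
The effect of replacing a summation by a maximum can be illustrated with a minimal Python sketch. It assumes that the reliability has the usual log-ratio form (the logarithm of a sum over sequences with u_k = 1 divided by a sum over sequences with u_k = 0) and that each term is supplied as its logarithm; the function names and the sample metrics are hypothetical and are not taken from the specification.

```python
import math


def log_sum_exp(log_terms):
    """Exact log of a sum of terms that are given by their logarithms."""
    peak = max(log_terms)
    return peak + math.log(sum(math.exp(t - peak) for t in log_terms))


def reliability_exact(log_terms_1, log_terms_0):
    """Log-ratio using the full summations in numerator and denominator."""
    return log_sum_exp(log_terms_1) - log_sum_exp(log_terms_0)


def reliability_max(log_terms_1, log_terms_0):
    """Same log-ratio with each summation replaced by its largest term."""
    return max(log_terms_1) - max(log_terms_0)


# Hypothetical log-domain terms: one term dominates in each summation.
dominant_1, dominant_0 = [10.0, 0.0], [1.0, -5.0]
gap = abs(reliability_exact(dominant_1, dominant_0)
          - reliability_max(dominant_1, dominant_0))
```

When one term dominates each summation, as in the sample values, the two results nearly coincide; when several terms are comparable, the maximum underestimates the summation and the approximation is coarser.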

FIG. 20D is a block diagram showing a decoder corresponding
to the self-concatenating coder of FIG. 6B. The
MAP decoder for decoding the specific embodiment shown
in FIG. 6B generates the reliabilities of u_1, u_2, and d. The
input reliabilities to the MAP decoder are subtracted from
the proper deinterleaved reliabilities and are fed back to the
same decoder as shown in FIG. 20D. The new input reliability
to the MAP decoder for u_1 is the interleaved version
of L_u2 + L_d; for u_2 is the interleaved version of L_u1 + L_d; and
for d is L_u1 + L_u2. At the first iteration, the decoder starts with
zero input reliabilities. Using the received observation (i.e.,
the noisy version of d and p if u_1 and u_2 were not transmitted;
see FIG. 6B), the MAP decoder generates the new
reliabilities for u_1, u_2, and d. At the second iteration, all input
reliabilities are non-zero. The decoder proceeds in the way
described above for as many iterations as desired. Since
there is only one MAP decoder, we call it a self-iterative
decoder.

FIG. 20D2 is a block diagram showing a decoder corre- 
sponding to the self-concatenated coder of FIG. 6B2. L u is 
generated by the MAP decoder for coder C. L u is 
deinterleaved, demultiplexed and provided to the adders as 
shown in FIG. 20D2 (where repetition, m=3, is used as an 
example). The normalized observation for d is added 
through the same adders. The outputs of the adders are 
multiplexed and are fed to the interleaver. The output of the 
interleaver represents the input reliabilities L u for the MAP 
decoder. The whole decoder iterates as many times as 
desired. The final decision is made at the last iteration by 
adding the demultiplexer output to the observation d, and 
hard-limiting the result. 

FIG. 20E is a block diagram showing a decoder corresponding
to the serial coder of FIG. 7A. The MAP decoder
C_0 in FIG. 20E is modified in order to generate not only the
reliability for the input data d but also the reliability for
coded bits u and p. This can be done by treating the input
reliabilities for u and p coming from the MAP decoder C_1 as
new received observations for MAP decoder C_0. In the trellis
representation (which is required for the MAP algorithm) of
code C_0, on each branch of the trellis, we treat u and p
similarly to d, as if they are used as input to encoder C_0. In
this way we can generate the reliability of u and p in a
manner similar to how we generate the reliability of d in the
original MAP algorithm. This simple stratagem provides the
required modified MAP algorithm. The original and modified
methods are illustrated in FIG. 20E2.

FIG. 20F is a block diagram showing a decoder corresponding
to the parallel-serial coder of FIG. 7B. Decoder
D_0, based on the MAP algorithm, accepts as an input the
reliability of parity bits generated by decoder D_2, and
generates the new extrinsic information on the parity p
(using the MAP algorithm), which is passed to decoder D_2.

FIG. 20G is a block diagram showing a decoder corresponding
to the hybrid concatenated code (serial-parallel,
type II) of FIG. 7C. The MAP decoder C_2, after receiving
the observation q_2, generates the quantity L_u2, which, after
passing through a deinterleaver, produces the input reliability
L_p of code bits for C_0. The MAP decoder C_1, after
receiving observation q_1, generates the quantity L_u1, which,
after passing through a deinterleaver, produces the input
reliability L_d of data bits for C_0. The modified MAP decoder
C_0 (as explained for FIG. 20E) accepts L_p and L_d as input
reliabilities and generates the quantity L_p (for coded bit p)
and L_d (for data bits d). L_p is provided to the MAP decoder
for C_2 through interleaver π_2. L_d is provided to the MAP
decoder for C_1 through interleaver π_1. The whole decoder iterates
as many times as desired. The decoded bits are obtained by
hard-limiting the reliabilities for d provided by MAP
decoder C_0.

Multiple-Code Algorithm Applied to Decoding of Two 
Codes 

For turbo codes with only two constituent codes, Eq. (17)
reduces to

$$L_{1k}^{(m+1)} = \alpha_1^{(m)} f(y_1, L_0, L_2^{(m)}, k) \qquad (31)$$

$$L_{2k}^{(m+1)} = \alpha_2^{(m)} f(y_2, L_0, L_1^{(m)}, k) \qquad (32)$$

for k = 1, 2, . . . , N and m = 1, 2, . . . , where, for each iteration,
α_1^(m) and α_2^(m) can be optimized (simulated annealing) or
set to 1 for simplicity. The decoding configuration for two
codes reduces to duplicate copies of the structure in FIG. 16
(i.e., to the serial mode).

If we optimize α_1^(m) and α_2^(m), our method for two codes
is similar to the decoding method proposed by Berrou.
However, our method with α_1^(m) and α_2^(m) equal to 1 is
simpler and achieves the same performance reported in
Robertson, supra, for rate 1/2 codes.
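
The update schedule of Eqs. (31) and (32) can be sketched in Python as follows. This is only an illustration of the order in which the two extrinsic sequences are exchanged: the function f is passed in as a callable, the extrinsic values are assumed to start at zero, the scale factors default to 1, and `toy_extrinsic` is a hypothetical stand-in used to exercise the loop, not a MAP decoder.

```python
def two_code_iterations(f, y1, y2, L0, num_iterations, alpha1=1.0, alpha2=1.0):
    """Iterate the two-code update: each code's new extrinsic values are
    computed from the other code's values of the previous iteration."""
    n = len(L0)
    L1 = [0.0] * n  # extrinsic values from code 1, assumed zero at the start
    L2 = [0.0] * n  # extrinsic values from code 2, assumed zero at the start
    for _ in range(num_iterations):
        new_L1 = [alpha1 * f(y1, L0, L2, k) for k in range(n)]
        new_L2 = [alpha2 * f(y2, L0, L1, k) for k in range(n)]
        L1, L2 = new_L1, new_L2
    return L1, L2


def toy_extrinsic(y, L0, L_other, k):
    """Hypothetical placeholder for f; a real decoder would run MAP here."""
    return y[k] + L0[k] + 0.5 * L_other[k]


L1, L2 = two_code_iterations(toy_extrinsic, [1.0], [2.0], [0.0], 2)
```

In a real decoder, f would also hide the interleaving and deinterleaving around each constituent decoder; the sketch indexes k from 0 rather than from 1.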

Turbo Trellis-Coded Modulation

A pragmatic approach for turbo codes with multilevel
modulation was proposed in S. LeGoff, A.
Glavieux, and C. Berrou, “Turbo Codes and High Spectral
Efficiency Modulation”, Proceedings of the IEEE ICC '94,
New Orleans, La., pp. 645-651, May 1-5, 1994. Here we
propose a different approach that outperforms those results
when M-ary quadrature amplitude modulation (M-QAM) or
M-ary phase shift keying (MPSK) modulation is used.

A straightforward method for the use of turbo codes for 
multilevel modulation is the following: 

(1) Select a rate b/(b+1) constituent code, where the outputs
are mapped to a 2^(b+1)-level modulation based on Ungerboeck’s
set partitioning method (G. Ungerboeck, “Channel
Coding With Multi-Level Phase Signals”, IEEE
Transactions on Information Theory, vol. IT-28, pp.
55-67, January 1982) (i.e., we can use Ungerboeck’s
codes with feedback).

(2) If MPSK modulation is used, for every b bits at the input
of the turbo encoder, we transmit two consecutive 2^(b+1)
phase-shift keying (PSK) signals, one per each encoder
output. This results in a throughput of b/2 bits/s/Hz.

(3) If M-QAM modulation is used, we map the b+1 outputs
of the first component code to the 2^(b+1) in-phase levels
(I-channel) and the b+1 outputs of the second component
code to the 2^(b+1) quadrature levels (Q-channel). The
throughput of this system is b bits/s/Hz.

First, we note that these methods require more levels of
modulation than conventional trellis-coded modulation
(TCM), which is not desirable in practice. Second, the input
information sequences are used twice in the output modulation
symbols, which also is not desirable. One remedy is
to puncture the output symbols of each trellis code and select
the puncturing pattern such that the output symbols of the
turbo code contain the input information only once. If the
output symbols of the first encoder are punctured, for
example, as 101010 . . . , the puncturing of the second
encoder must be nonuniform to guarantee that all information
symbols are used, and it depends on the particular
choice of interleaver. Now, for example, for 2^(b+1) PSK, a
throughput b can be achieved. This method has two drawbacks:
It complicates the encoder and decoder, and the
reliability of punctured symbols may not be fully estimated
at the decoder. A better remedy, for rate b/(b+1) (b even)
codes, is discussed in the next section.
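
The dependence of the second puncturing pattern on the interleaver can be illustrated with a minimal Python sketch. It assumes, for illustration only, that `perm[j]` gives the index of the information symbol seen by the second encoder at position j and that a pattern entry of 1 means the symbol is kept; the function name and the sample interleaver are hypothetical.

```python
def second_encoder_puncturing(perm, first_pattern):
    """Keep position j of the second encoder exactly when the first encoder
    punctured the information symbol perm[j] that position j carries."""
    return [0 if first_pattern[perm[j]] else 1 for j in range(len(perm))]


# Hypothetical 4-symbol interleaver; the first encoder uses the 1010 pattern.
perm = [2, 0, 3, 1]
first_pattern = [1, 0, 1, 0]
second_pattern = second_encoder_puncturing(perm, first_pattern)
```

For this sample interleaver the second pattern comes out as 0011 rather than an alternating pattern, which shows why the second puncturing is nonuniform in general, while every information symbol is still carried exactly once.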

A New Method to Construct Turbo TCM 

For a q=2 turbo code with rate b/(b+1) constituent
encoders, select the b/2 systematic outputs and puncture the
rest of the systematic outputs, but keep the parity bit of the
b/(b+1) code (note that the rate b/(b+1) code may have been
obtained already by puncturing a rate 1/2 code). Then do the
same to the second constituent code, but select only those
systematic bits that were punctured in the first encoder. This
method requires at least two interleavers: the first interleaver
permutes the bits selected by the first encoder and the second
interleaver permutes those bits punctured by the first
encoder. For MPSK (or M-QAM), we can use 2^(1+b/2) PSK
symbols (or 2^(1+b/2) QAM symbols) per encoder and achieve
throughput of b/2. For M-QAM, we can also use 2^(1+b/2)
levels in the I-channel and 2^(1+b/2) levels in the Q-channel and
achieve a throughput of b bits/s/Hz.
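
The arithmetic in the preceding paragraph can be checked with a short Python sketch. The function name and the dictionary keys are hypothetical; the sketch only restates the counts given above (b/2 systematic bits plus one parity bit per encoder) and is not a modulator.

```python
def turbo_tcm_parameters(b):
    """Signal-set sizes and throughputs for rate b/(b+1) constituent codes
    when each encoder keeps b/2 systematic bits and its parity bit."""
    if b <= 0 or b % 2 != 0:
        raise ValueError("b must be a positive even integer")
    bits_per_encoder = b // 2 + 1      # b/2 systematic bits + 1 parity bit
    levels = 2 ** bits_per_encoder     # 2^(1+b/2)
    return {
        "bits_per_encoder": bits_per_encoder,
        "psk_points": levels,              # one PSK symbol per encoder
        "psk_throughput": b / 2,           # bits/s/Hz
        "levels_per_channel": levels,      # I-channel and Q-channel levels
        "qam_points": levels * levels,     # combined I/Q constellation
        "qam_throughput": b,               # bits/s/Hz
    }


params_b2 = turbo_tcm_parameters(2)
params_b4 = turbo_tcm_parameters(4)
```

With b=2 this gives 4 levels per channel, i.e. a 16-point QAM constellation at 2 bits/s/Hz; with b=4 it gives 8 PSK at 2 bits/s/Hz or a 64-point QAM constellation at 4 bits/s/Hz.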

These methods are equivalent to a multidimensional
trellis-coded modulation scheme (in this case, two multilevel
symbols per branch) that uses 2^(b/2) × 2^(1+b/2) symbols per
branch, where the first symbol in the branch (which depends
only on uncoded information) is punctured. Now, with these
methods, the reliability of the punctured symbols can be
fully estimated at the decoder. Obviously, the constituent
codes for a given modulation should be redesigned based on
the Euclidean distance.

EXAMPLES 

The first example is for b=2 with 16 QAM modulation
where, for simplicity, we can use the 2/3 codes in Table I
above with Gray code mapping. Note that this may result in
suboptimum constituent codes for multilevel modulation. A
turbo encoder with 16 QAM and two clock-cycle trellis
termination is shown in FIG. 21. The BER performance of
this code with the turbo decoding structure for two codes
discussed above is given in FIG. 22. For permutations π_1 and
π_2, we used S-random permutations with S=40 and S=32,
with a block size of 16,384 bits. Throughput was 2 bits/s/Hz.
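
An S-random permutation of the kind used in this example can be generated along the following lines. The Python sketch assumes the commonly used definition, in which each newly selected integer must differ by more than S from each of the S previously selected integers; it uses a greedy search with restarts rather than pure rejection, and the function name, the seed, and the small block size are hypothetical choices made to keep the illustration fast.

```python
import random


def s_random_permutation(n, s, seed=0, max_restarts=1000):
    """Return a permutation of range(n) in which entries at most s positions
    apart differ in value by more than s (greedy search with restarts)."""
    rng = random.Random(seed)
    for _ in range(max_restarts):
        candidates = list(range(n))
        rng.shuffle(candidates)
        perm = []
        while candidates:
            recent = perm[-s:] if s > 0 else []
            for idx, value in enumerate(candidates):
                if all(abs(value - prev) > s for prev in recent):
                    perm.append(candidates.pop(idx))
                    break
            else:
                break  # no remaining candidate fits; restart from scratch
        if len(perm) == n:
            return perm
    raise RuntimeError("no S-random permutation found; try a smaller s")


# Small illustrative case; the example above uses 16,384 bits with S=40 or 32.
perm = s_random_permutation(128, 4, seed=1)
```

The search becomes slower, and may fail, as S grows relative to the block size, which is why the sketch exposes a restart limit.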

For 8 PSK modulation, we used two 16-state, rate 4/5
codes given above to achieve a throughput of 2 bits/s/Hz.
The parallel concatenated trellis code with 8 PSK and two
clock-cycle trellis termination is shown in FIG. 23. The BER
performance of this code is given in FIG. 24.

For 64 QAM modulation, we used two 16-state, rate 4/5
codes given above to achieve a throughput of 4 bits/s/Hz.
The parallel concatenated trellis code with 64 QAM and
two clock-cycle trellis termination is shown in FIG. 25. The
BER performance of this code is given in FIG. 26.

For permutations π_1, π_2, π_3, π_4 in FIGS. 23 and 25, we
used random permutations, each with a block size of 4096
bits. As discussed above, there is no need to use four
permutations; two permutations suffice, and may even result
in a better performance. Extension of the described method
for construction of turbo TCM based on Euclidean distance
is straightforward.

Application of TCM to the turbo code structures shown
herein provides a number of advantages, including power
efficiency and bandwidth efficiency, resulting in a higher
data rate.

FIG. 27 is a block diagram showing application of a TCM 
module M in combination with a conventional two code 
turbo coder to give the advantages noted above. In addition, 
such a module M is shown in outline in FIGS. 6 and 7. It 
should be noted that the structures shown in FIGS. 6, 7, and 
27 are general in nature, and provide advantages indepen- 
dent of specific interleavers, coders, and TCM modules. 

FIG. 28 is a block diagram showing a general iterative 
decoder structure for the TCM encoded output of, for 
example, FIGS. 21, 23, and 25. 

Conclusion 

Further information about some aspects of the present 
invention, such as proofs of theorems, may be found in the 
following articles, which are hereby incorporated by refer- 
ence: 

D. Divsalar and F. Pollara, “Multiple Turbo Codes for 
Deep-Space Communications”, The Telecommunications 
and Data Acquisition Progress Report 42-121, 
January-March 1995, Jet Propulsion Laboratory, 
Pasadena, Calif., pp. 66-77, May 15, 1995. 

D. Divsalar and F. Pollara, “Turbo Codes for PCS 
Applications”, Proceedings of IEEE ICC '95, Seattle, 
Wash., pp. 54-59, June 1995. 

D. Divsalar and F. Pollara, “Turbo Codes for Deep-Space 
Communications”, IEEE Communication Theory 
Workshop, Apr. 23-26, 1995, Santa Cruz, Calif. 

D. Divsalar and F. Pollara, “Low-rate Turbo Codes for 
Deep-Space Communications”, IEEE International Sym- 
posium on Information Theory, September 17-22, 
Whistler, Canada. 

D. Divsalar and F. Pollara, “Multiple Turbo Codes”, MIL- 
COM 95, San Diego, Calif., Nov. 5-8, 1995. 

D. Divsalar and F. Pollara, “On the Design of Turbo Codes”, 
The Telecommunications and Data Acquisition Progress 
Report 42-123, July-September 1995, Jet Propulsion 
Laboratory, Pasadena, Calif., pp. 99-121, Nov. 15, 1995.

A number of embodiments of the present invention have
been described. Nevertheless, it will be understood that
various modifications may be made without departing from
the spirit and scope of the invention. For example, where
specific values (e.g., for interleaver size) are given, other
values generally can be substituted in known fashion. A particular
encoder may be implemented as a hardware device
while the corresponding decoder is implemented in
software, or vice versa. Accordingly, it is to be understood
that the invention is not to be limited by the specific
illustrated embodiment, but only by the scope of the
appended claims.

What is claimed is: 

1. A system for error-correction coding of a source of 
original digital data elements, comprising: 

(a) a first systematic convolutional encoder, coupled to the 
source of original digital data elements, for generating 
a first series of coded output elements derived from the 
original digital data elements; 

(b) at least one interleaver, each coupled to the source of
original digital data elements, for modifying the order
of the original digital data elements to generate respective
sets of interleaved elements; and

(c) at least one next systematic convolutional encoder,
each coupled to respective interleaved elements, each
for generating a corresponding next series of coded
output elements derived from a respective set of interleaved
elements, each next series of coded output
elements being in parallel with the first series of coded
output elements;

wherein the system for error-correction coding outputs
only the first series of coded output elements and each
next series of coded output elements.

2. The system of claim 1, further including a decoder for 
receiving signals representative of at least some of the first 
series of coded output elements and of at least some of each 
next series of coded output elements, and for generating the 
original digital data elements from such received signals. 

3. The system of claim 1, further including a multilevel
modulator, coupled to the coded output elements of each
systematic convolutional encoder, for generating an output
modulated signal representative of at least some of such
coded output elements.

4. The system of claim 3, wherein the multilevel modulator
generates a trellis code modulation.

5. The system of claim 3, further including a demodulator
for demodulating the output signal of the multilevel modulator
into a data signal representative of at least some of the
first series of coded output elements and of at least some of
each next series of coded output elements, and a decoder,
coupled to the demodulator, for generating the original
digital data elements from the data signal.

6. A system for error-correction coding of a plurality of
sources of original digital data elements, comprising:

(a) a first systematic convolutional encoder, coupled to
each source of original digital data elements, for generating
a first set of series coded output elements
derived from the original digital data elements;

(b) at least one set of interleavers, each set coupled to
respective sources of original digital data elements, for
modifying the order of the original digital data elements
from the respective coupled sources to generate
a respective set of interleaved elements; and

(c) at least one next systematic convolutional encoder,
each coupled to at least one set of interleaved elements,
each for generating a corresponding next set of series
coded output elements derived from the coupled sets of
interleaved elements, each next set of series coded
output elements being in parallel with the first set of
series coded output elements.

7. The system of claim 6, wherein the system for error- 
correction coding further outputs the original digital data 
elements. 

8. The system of claim 6, wherein the system for error-correction
coding outputs only the first set of series coded
output elements and each next set of series coded output
elements.

9. The system of claim 6, further including a decoder for
receiving signals representative of at least some of the first
set of series coded output elements and of at least some of
each next set of series coded output elements, and for
generating the original digital data elements from such
received signals.

10. The system of claim 6, further including a multilevel
modulator, coupled to the coded output elements, for generating
an output modulated signal representative of at least
some of the coded output elements.

11. The system of claim 10, wherein the multilevel
modulator generates a trellis code modulation.

12. The system of claim 10, further including a demodulator
for demodulating the output signal of the multilevel
modulator into a data signal representative of at least some
of the first set of series coded output elements and of at least
some of each next set of series coded output elements, and
a decoder, coupled to the demodulator, for generating the
original digital data elements from the data signal.

13. A system for error-correction coding of a source of 
original digital data elements comprising: 

(a) at least one interleaver, each coupled to the source of
original digital data elements, for modifying the order
of the original digital data elements to generate respective
interleaved elements; and

(b) a single systematic recursive convolutional encoder
module, coupled to the source of original digital data
elements and to interleaved elements from at least one
interleaver, for generating a set of coded output elements
derived from the original digital data elements;

wherein the system for error-correction coding outputs the
set of coded output elements and the original digital
data elements.

14. The system of claim 13, further including a decoder 
for receiving signals representative of at least some of the set 
of coded output elements, and for generating the original 
digital data elements from such received signals. 

15. The system of claim 13, further including a multilevel
modulator, coupled to the set of coded output elements of the
systematic convolutional encoder, for generating an output
modulated signal representative of at least some of the set of
coded output elements.

16. The system of claim 15, wherein the multilevel
modulator generates a trellis code modulation.

17. The system of claim 15, further including a demodulator
for demodulating the output signal of the multilevel
modulator into a data signal representative of at least some
of the set of coded output elements, and a decoder, coupled
to the demodulator, for generating the original digital data
elements from the data signal.

18. A system for error-correction coding of a source of 
original digital data elements, comprising: 

(a) a first encoder, coupled to the source of original digital
data elements, for generating a plurality of coded
intermediate output elements derived from the original
digital data elements;

(b) at least one interleaver, each coupled to at least one of
the plurality of coded intermediate output elements, for
modifying the order of the coded intermediate output
elements to generate respective interleaved output elements;
and

(c) at least one systematic recursive convolutional
encoder, each coupled to at least one interleaver, for
generating a set of coded output elements derived from
the interleaved output elements from each coupled
interleaver.

19. The system of claim 18, further including a decoder
for receiving signals representative of at least some of the set
of coded output elements, and for generating the original
digital data elements from such received signals.

20. The system of claim 18, further including a multilevel
modulator, coupled to each set of coded output elements, for
generating an output modulated signal representative of at
least some of the set of coded output elements.

21. The system of claim 20, wherein the multilevel 
modulator generates a trellis code modulation. 

22. The system of claim 20, further including a demodulator
for demodulating the output signal of the multilevel
modulator into a data signal representative of at least some
of the coded output elements, and a decoder, coupled to the
demodulator, for generating the original digital data elements
from the data signal.

23. A system for error-correction coding of a source of 
original digital data elements, comprising: 

(a) a first systematic convolutional encoder, coupled to the 
source of original digital data elements, for generating 
a first series of coded output elements derived from the 
original digital data elements; 

(b) at least one interleaver, each coupled to the source of 
original digital data elements, for modifying the order 
of the original digital data elements to generate respec- 
tive interleaved elements; 

(c) at least one next systematic convolutional encoder, 
each coupled to respective interleaved elements, each 
for generating a corresponding next series of coded 
output elements derived from the respective interleaved 
elements, each next series of coded output elements 
being in parallel with the first series of coded output 
elements; and 

(d) a multilevel modulator, directly coupled to the original 
digital data elements and to the coded output elements 
of each systematic convolutional encoder, for generat- 
ing an output modulated signal representative of at least 
some of such original digital data elements and coded 
output elements. 

24. The system of claim 23, wherein the multilevel 
modulator generates a trellis code modulation. 

25. A system for error-correction coding and multilevel 
modulation of a plurality of sources of original digital data 
elements, comprising: 

(a) a first systematic convolutional encoder, coupled to 
each source of original digital data elements, for sys- 
tematically selecting a first subset of the original digital 
data elements and generating a first series of coded 
output elements derived from the first selected subset, 
and for outputting at least one source of original digital 
data elements unchanged; 

(b) at least two interleavers, each coupled to a respective 
one of the plurality of sources of original digital data 
elements, for modifying the order of the original digital 
data elements to generate respective sets of interleaved 
elements; 

(c) at least one next systematic convolutional encoder, 
each coupled to at least two interleavers, each for 
systematically selecting a next subset from the sets of 
interleaved elements different from each other selected 
subset, for generating a corresponding next series of 
coded output elements derived from the corresponding 
next subset, each next series of coded output elements 
being in parallel with the first series of coded output 
elements, and for outputting at least one set of inter- 
leaved elements unchanged; and 

(d) a multilevel modulator, coupled to the original digital 
data elements, the unchanged interleaved elements, and 
the coded output elements, for generating an output 
modulated signal representative of at least some of such 
original digital data elements, unchanged interleaved 
elements, and coded output elements. 

26. The system of claim 25, wherein the multilevel 
modulator generates a trellis code modulation. 

27. A system for terminating a turbo encoder comprising: 

(a) a plurality of serially connected delay elements D 
having a tap after each delay element; 

(b) a plurality of first selective combinatorial devices, at
least one before the first serially connected delay element
D, at least one after the last serially connected
delay element D, and at least one between each intermediate
pair of serially connected delay elements D;

(c) at least one data source line u_b, where b is the number
of data source lines, each coupled to each first selective
combinatorial device as input lines;

(d) at least one set of next selective combinatorial devices,
each set comprising a plurality of selective combinatorial
devices each coupled to a corresponding tap and
serially coupled together, with an end selective combinatorial
device of each set selectively coupled to a
corresponding data source line;

wherein, to terminate input to the delay elements D, the
sets of next selective combinatorial devices are coupled
to the corresponding data source line and selectively
actuated to select tap coefficients a_{i0}, . . . , a_{i,m-1} for
i = 1, 2, . . . , b, to apply to a corresponding data source line,
wherein the tap coefficients are obtained by repeated
use of the following equation, and by solving the
resulting equations:

$$S^k(D) = \left[ \sum_{i=1}^{b} u_i^k h_i(D) + D\,S^{k-1}(D) \right] \bmod h_0(D)$$

where S^k(D) is the state of the turbo encoder at time k with
coefficients S^k_0, S^k_1, . . . , S^k_{m-1} for input u^k_1, . . . , u^k_b, and
termination in state zero is achieved in at most m clock
cycles.

28. A decoder system for decoding a plurality of
sequences of received signals y_i, representative of code
elements x_i generated by a turbo encoder from a source of
original digital data elements u_i, into decoded elements
corresponding to the original digital data elements u_i, the
decoder system comprising:

(a) at least three decoder modules, each having a received
signal input i, a feedback input, and an output, the
output of each decoder module being coupled to the
feedback input of each other decoder module; and

(b) a summing module, coupled to each output of each
decoder module, for generating final decoded elements
from the outputs of the decoder modules;

wherein each sequence of received signals y_i is coupled to
the received signal input i of a corresponding decoder
module.

29. The decoder system of claim 28, wherein each 
decoder module includes: 

(a) a feedback input comprising a combinatorial element; 

(b) a permuter, coupled to the combinatorial element; 

(c) a probability-based decoder, coupled to the permuter 
and including a received signal input; 

(d) an inverse permuter, coupled to the probability-based
decoder;

(e) a differential combinatorial element, coupled to the 
inverse permuter; and 

(f) a delay element, coupled between the combinatorial 
element and the differential combinatorial element. 

30. The decoder system of claim 29, wherein the
probability-based decoder uses the maximum a posteriori
probability algorithm.

31. The decoder system of claim 29, wherein the
probability-based decoder uses the soft output Viterbi
algorithm.

32. An iterative decoder system for decoding at least one
sequence of received signals y_i, representative of code
elements x_i generated by a self-concatenated encoder from
a source of original digital data elements u_i, into decoded
elements corresponding to the original digital data elements
u_i, the decoder system comprising:

(a) a plurality of feedback inputs each comprising a 
combinatorial element; 

(b) a plurality of permuters, each coupled to a correspond- 
ing combinatorial element; 

(c) a probability-based decoder, coupled to each permuter
and at least one combinatorial element, and including a
received signal input and an output;

(d) a plurality of inverse permuters, each coupled to the 
probability-based decoder so as to receive a signal 
associated with a corresponding permuter; 

(e) a plurality of differential combinatorial elements, one
coupled to the output of the probability-based decoder
and each other coupled to a corresponding inverse
permuter, and each coupled to every non-corresponding
feedback input; and

(f) a plurality of delay elements, each coupled between a 
corresponding combinatorial element and a corre- 
sponding differential combinatorial element; 

wherein each sequence of received signals y_i is coupled to
the received signal input of the probability-based
decoder.

33. The decoder system of claim 32, wherein the
probability-based decoder uses the maximum a posteriori
probability algorithm.

34. The decoder system of claim 32, wherein the 
probability-based decoder uses the soft output Viterbi algo- 
rithm. 

35. An iterative decoder system for decoding at least one
sequence of received signals y_i, representative of code
elements x_i generated by a serial encoder from a source of
original digital data elements u_i, into decoded elements
corresponding to the original digital data elements u_i, the
decoder system comprising:

(a) a plurality of permuters each having an input; 

(b) at least one first probability-based decoder, each
coupled to a corresponding permuter, and including a
received signal input;

(c) a plurality of inverse permuters, each coupled to a 
corresponding probability-based decoder so as to 
receive a signal associated with a corresponding per- 
muter; 

(d) a plurality of first differential combinatorial elements, 
each coupled to a corresponding inverse permuter; 

(e) a plurality of first delay elements, each coupled 
between the input of a corresponding permuter and a 
corresponding first differential combinatorial element; 

(f) a second probability-based decoder, coupled to each
first differential combinatorial element, and including
inputs corresponding to each first differential combinatorial
element, and an output;

(g) a plurality of second differential combinatorial 
elements, each coupled to the second probability-based 
decoder and to the input of a corresponding permuter; 

(h) a plurality of second delay elements, each coupled 
between corresponding inputs of the second 
probability-based decoder and a corresponding second 
differential combinatorial element; 

wherein the sequence of received signals y_i is coupled to
corresponding received signal inputs of the first
probability-based decoders.

36. The decoder system of claim 35, wherein at least one
probability-based decoder uses the maximum a posteriori
probability algorithm.

37. The decoder system of claim 35, wherein at least one
probability-based decoder uses the soft output Viterbi
algorithm.

38. A method for error-correction coding of a source of 
original digital data elements, comprising the steps of: 

(a) generating a first series of systematic convolutional
encoded output elements derived from a source of
original digital data elements;

(b) modifying the order of the original digital data ele- 
ments to generate at least one set of respective inter- 
leaved elements; and 

(c) generating at least one corresponding next series of 
systematic convolutional encoded output elements 
derived from a corresponding set of respective inter- 
leaved elements, each next series of coded output 
elements being in parallel with the first series of coded
output elements; 

(d) outputting only the first series of systematic convo- 
lutional encoded output elements and each next series 
of systematic convolutional encoded output elements. 
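
Steps (a) through (d) of claim 38 can be illustrated with a minimal sketch. It rests on several assumptions that are not in the claim: each "series" is taken to be the parity stream of a 4-state recursive systematic convolutional encoder with feedback polynomial 1 + D + D^2 and feedforward polynomial 1 + D^2, "outputting only" the coded series is read as not transmitting the original data, and the names `rsc_parity` and `parallel_encode` are hypothetical.

```python
import random


def rsc_parity(bits):
    """Parity stream of a 4-state recursive systematic convolutional encoder.

    Assumed polynomials (illustrative only): feedback 1 + D + D^2,
    feedforward 1 + D^2.
    """
    s1 = s2 = 0                  # contents of the two delay elements
    parity = []
    for u in bits:
        a = u ^ s1 ^ s2          # feedback bit entering the register
        parity.append(a ^ s2)    # feedforward taps at 1 and D^2
        s1, s2 = a, s1           # shift the register
    return parity


def parallel_encode(bits, permutations):
    """Return one parity stream for the data and one per interleaver."""
    streams = [rsc_parity(bits)]                    # step (a)
    for perm in permutations:
        interleaved = [bits[j] for j in perm]       # step (b)
        streams.append(rsc_parity(interleaved))     # step (c)
    return streams                                  # step (d): parity only


rng = random.Random(0)
data = [rng.randint(0, 1) for _ in range(16)]
perms = [rng.sample(range(16), 16) for _ in range(2)]
code_streams = parallel_encode(data, perms)
print(len(code_streams), "parallel streams of", len(code_streams[0]), "elements")
```

With two interleavers the sketch yields three parallel streams for every block of data, so the overall rate under these assumptions is 1/3.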

39. The method of claim 38, further including the steps of:

(a) receiving signals representative of at least some of the 
first series of systematic convolutional encoded output 
elements and of at least some of each next series of 
systematic convolutional encoded output elements; and 

(b) generating the original digital data elements from such
received signals. 

40. The method of claim 38, further including the step of 
generating an output multilevel modulated signal represen- 
tative of at least some of such coded output elements. 

41. The method of claim 40, wherein the multilevel
modulation is trellis code modulation. 

42. The method of claim 40, further including the steps of: 

(a) demodulating the output multilevel modulated signal 
into a data signal representative of at least some of the 
first series of systematic convolutional encoded output
elements and of at least some of each next series of 
systematic convolutional encoded output elements; 

(b) generating the original digital data elements from the
data signal.

43. A method for error-correction coding of a plurality of 
sources of original digital data elements, comprising the 
steps of: 

(a) generating a first set of series systematic convolutional 
encoded output elements derived from a plurality of
sources of original digital data elements; 

(b) modifying the order of the original digital data ele- 
ments from the respective sources to generate at least 
one set of respective interleaved elements; and 

(c) generating at least one corresponding next set of series
systematic convolutional encoded output elements 
derived from a corresponding set of respective inter- 
leaved elements, each next set of series systematic 
convolutional encoded output elements being in parallel
with the first set of series systematic convolutional
encoded output elements. 

44. The method of claim 43, including the further step
of outputting the original digital data elements. 

45. The method of claim 43, including the further step
of outputting only the first set of series systematic
convolutional encoded output elements and each next set of series
systematic convolutional encoded output elements. 

46. The method of claim 43, further including the steps of:

(a) receiving signals representative of at least some of the 
first set of series systematic convolutional encoded 
output elements and of at least some of each next set of 
series systematic convolutional encoded output ele- 
ments; and 

(b) generating the original digital data elements from such 
received signals. 

47. The method of claim 43, further including the step of 
generating an output multilevel modulated signal represen- 
tative of at least some of such systematic convolutional 
encoded output elements. 

48. The method of claim 47, wherein the multilevel 
modulation is trellis code modulation. 

49. The method of claim 47, further including the steps of: 

(a) demodulating the output multilevel modulated signal 
into a data signal representative of at least some of the 
first set of series systematic convolutional encoded 
output elements and of at least some of each next set of 
series systematic convolutional encoded output ele- 
ments; and 

(b) generating the original digital data elements from the 
data signal. 

50. A method for error-correction coding of a source of 
original digital data elements, comprising the steps of: 

(a) modifying the order of a source of original digital data 
elements to generate at least one set of interleaved 
elements; and 

(b) generating a set of systematic convolutional encoded 
output elements derived from the original digital data 
elements and at least one set of interleaved elements; 

(c) outputting only the set of systematic convolutional 
encoded output elements. 

51. The method of claim 50, further including the steps of: 

(a) receiving signals representative of at least some of the 
set of systematic convolutional encoded output ele- 
ments; and 

(b) generating the original digital data elements from such 
received signals. 

52. The method of claim 50, further including the step of 
generating an output multilevel modulated signal represen- 
tative of at least some of such systematic convolutional 
encoded output elements. 

53. The method of claim 52, wherein the multilevel 
modulation is trellis code modulation. 

54. The method of claim 52, further including the steps of: 

(a) demodulating the output multilevel modulated signal 
into a data signal representative of at least some of the 
set of systematic convolutional encoded output ele- 
ments; and 

(b) generating the original digital data elements from the 
data signal. 

55. A method for error-correction coding of a source of 
original digital data elements, comprising the steps of: 

(a) generating a plurality of systematic convolutional 
encoded intermediate output elements derived from a 
source of original digital data elements; 

(b) modifying the order of the systematic convolutional 
encoded intermediate output elements to generate at 
least one set of respective interleaved output elements; 
and 

(c) generating a set of systematic convolutional encoded 
output elements derived from at least one set of inter- 
leaved output elements. 
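
The three steps of claim 55 describe a serial arrangement, which can be sketched as follows. This is only an illustration under stated assumptions: both the outer and inner encoders are taken to be the same rate-1/2 recursive systematic convolutional code (feedback 1 + D + D^2, feedforward 1 + D^2), each emits its input bit followed by a parity bit, and the names `rsc_parity` and `serial_encode` are hypothetical.

```python
import random


def rsc_parity(bits):
    """Parity stream of an assumed 4-state recursive systematic encoder."""
    s1 = s2 = 0
    parity = []
    for u in bits:
        a = u ^ s1 ^ s2          # feedback 1 + D + D^2
        parity.append(a ^ s2)    # feedforward 1 + D^2
        s1, s2 = a, s1
    return parity


def multiplex(bits):
    """Rate-1/2 systematic output: each input bit followed by its parity."""
    out = []
    for u, p in zip(bits, rsc_parity(bits)):
        out.extend((u, p))
    return out


def serial_encode(bits, perm):
    intermediate = multiplex(bits)                  # step (a): outer code
    interleaved = [intermediate[j] for j in perm]   # step (b): interleave
    return multiplex(interleaved)                   # step (c): inner code


data = [1, 0, 1, 1, 0, 0, 1, 0]
perm = random.Random(1).sample(range(2 * len(data)), 2 * len(data))
codeword = serial_encode(data, perm)
print(len(data), "data elements ->", len(codeword), "coded elements")
```

Unlike the parallel arrangement, the interleaver here acts on encoded intermediate elements, so the inner encoder sees the outer parity as well as the data.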

56. The method of claim 55, further including the steps of:

(a) receiving signals representative of at least some of the 
set of systematic convolutional encoded output ele- 
ments; and 

(b) generating the original digital data elements from such
received signals. 

57. The method of claim 55, further including the step of
generating an output multilevel modulated signal
representative of at least some of such systematic convolutional
encoded output elements.

58. The method of claim 57, wherein the multilevel 
modulation is trellis code modulation. 

59. The method of claim 57, further including the steps of: 

(a) demodulating the output multilevel modulated signal 
into a data signal representative of at least some of the 
set of systematic convolutional encoded output ele- 
ments; and 

(b) generating the original digital data elements from the
data signal.

60. A method for error-correction coding of a source of 
original digital data elements, comprising the steps of: 

(a) generating a first series of systematic convolutional
encoded output elements derived from a source of
original digital data elements;

(b) modifying the order of the original digital data ele- 
ments to generate at least one set of interleaved ele- 
ments; 

(c) generating at least one next series of systematic 
convolutional encoded output elements derived from at
least one set of interleaved elements, each next series of 
systematic convolutional encoded output elements 
being in parallel with the first series of systematic 
convolutional encoded output elements; and 

(d) generating an output multilevel modulated signal 
directly from and representative of at least some of 
such original digital data elements and systematic con- 
volutional encoded output elements. 

61. The method of claim 60, wherein the multilevel
modulation is trellis code modulation. 

62. A method for error-correction coding and multilevel 
modulation of a plurality of sources of original digital data 
elements, comprising the steps of: 

(a) systematically selecting a first subset of original digital
data elements from a plurality of sources of original 
digital data elements; 

(b) generating a first series of systematic convolutional
encoded output elements derived from the first selected
subset;

(c) outputting at least one source of original digital data 
elements unchanged; 

(d) modifying the order of the original digital data
elements to generate at least two sets of interleaved
elements;

(e) systematically selecting a next subset from the sets of 
interleaved elements different from each other selected 
subset; 

(f) generating at least one next series of systematic
convolutional encoded output elements derived from a
corresponding next subset, each next series of systematic
convolutional encoded output elements being in
parallel with the first series of systematic convolutional
encoded output elements;

(g) outputting at least one set of interleaved elements
unchanged; and

(h) generating an output multilevel modulated signal 
representative of at least some of such original digital 
data elements, unchanged interleaved elements, and 
systematic convolutional encoded output elements. 

63. The method of claim 62, wherein the multilevel 
modulation is trellis code modulation. 

64. A method for terminating input to a turbo encoder 
comprising a plurality of serially connected delay elements 
D having a tap after each delay element; a plurality of first 
selective combinatorial devices, at least one before the first 
serially connected delay element D, at least one after the last 
serially connected delay element D, and at least one between 
each intermediate pair of serially connected delay elements 
D; at least one data source line u_b, where b is the number of
data source lines, each coupled to each first selective com- 
binatorial device as input lines; and at least one set of next 
selective combinatorial devices, each set comprising a plu- 
rality of selective combinatorial devices each coupled to a 
corresponding tap and serially coupled together, with an end 
selective combinatorial device of each set selectively 
coupled to a corresponding data source line; the method 
comprising the steps of: 

(a) coupling the sets of next selective combinatorial 
devices to the corresponding data source line; 

(b) selectively actuating the sets of next selective combinatorial
devices to select tap coefficients a_{i0}, ..., a_{i,m-1}
for i = 1, 2, ..., b, to apply to a corresponding data source
line, wherein the tap coefficients are obtained by
repeated use of the following equation, and by solving
the resulting equations:

$$S^k(D) = \left[ \sum_{i=1}^{b} u_i^k \, h_i(D) + D \, S^{k-1}(D) \right] \bmod h_0(D)$$

where S^k(D) is the state of the turbo encoder at time k with
coefficients S^k_0, S^k_1, ..., S^k_{m-1} for input u^k_1, ..., u^k_b, and
termination in state zero is achieved in at most m clock
cycles.
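
The state recursion in claim 64 can be explored numerically with a short sketch. It assumes the recursion reads "new state = (sum of u_i h_i(D) + D times the previous state) mod h_0(D)" over GF(2), and it stores a polynomial as an integer whose bit j is the coefficient of D^j. Instead of solving for the tap coefficients as the claim describes, it uses an exhaustive search for a terminating input sequence, which is a swapped-in technique chosen for brevity. The function names and the example polynomials (h_0 = 1 + D + D^2, h_1 = 1 + D^2, b = 1) are hypothetical.

```python
from itertools import product


def gf2_mod(p, h0):
    """Remainder of polynomial p modulo h0 over GF(2)."""
    deg = h0.bit_length() - 1
    while p.bit_length() > deg:
        p ^= h0 << (p.bit_length() - 1 - deg)   # cancel the leading term
    return p


def next_state(state, inputs, h, h0):
    """One clock cycle of the assumed state recursion."""
    acc = state << 1                 # D * S^{k-1}(D)
    for u, hi in zip(inputs, h):
        if u:
            acc ^= hi                # add u_i * h_i(D)
    return gf2_mod(acc, h0)


def terminating_inputs(state, h, h0):
    """Shortest input sequence (at most m cycles) reaching state zero."""
    m = h0.bit_length() - 1          # number of delay elements
    b = len(h)                       # number of data source lines
    for steps in range(m + 1):
        for seq in product(product((0, 1), repeat=b), repeat=steps):
            s = state
            for inputs in seq:
                s = next_state(s, inputs, h, h0)
            if s == 0:
                return list(seq)
    return None                      # no termination found within m cycles


h0 = 0b111       # assumed feedback polynomial 1 + D + D^2
h = [0b101]      # assumed h_1(D) = 1 + D^2, a single data source line
for start in range(4):
    print(start, terminating_inputs(start, h, h0))
```

For the example polynomials every one of the four states reaches zero within m = 2 clock cycles, which agrees with the bound stated in the claim; other polynomial choices would need to be checked separately.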

65. A method for decoding a plurality of sequences of 
received signals y_i, representative of systematic convolutional
encoded elements x_i generated from a source of
original digital data elements u_i, into decoded elements
corresponding to the original digital data elements u_i, the
method comprising the steps of: 

(a) coupling at least three decoder modules, each having 
a received signal input i, a feedback input, and an 
output, such that the output of each decoder module is 
coupled to the feedback input of each other decoder 
module; 

(b) applying each sequence of received signals y_i to the
received signal input i of a corresponding decoder 
module; and 

(c) summing the output of each decoder module to 
generate final decoded elements. 

66. The method of claim 65, wherein each decoder 
module includes: 

(a) a feedback input comprising a combinatorial element; 

(b) a permuter, coupled to the combinatorial element; 

(c) a probability-based decoder, coupled to the permuter 
and including a received signal input; 

(d) an inverse permuter, coupled to the probability-based 
decoder; 

(e) a differential combinatorial element, coupled to the 
inverse permuter; and 
(f) a delay element, coupled between the combinatorial
element and the differential combinatorial element. 
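
The wiring described in claims 65 and 66 can be sketched as follows. This is an illustration of the signal flow only: the probability-based decoder is replaced by a trivial stand-in (`siso_stub`, hypothetical) that adds the prior to the channel value, which is neither the maximum a posteriori algorithm nor the soft output Viterbi algorithm. All values are log-likelihood ratios, with the assumed convention that a positive value favours bit 0.

```python
def siso_stub(prior, channel):
    """Stand-in for a probability-based decoder (assumption, not MAP/SOVA)."""
    return [p + c for p, c in zip(prior, channel)]


def decode(channel_llrs, perms, iterations=3):
    num, n = len(perms), len(perms[0])
    extrinsic = [[0.0] * n for _ in range(num)]     # module outputs
    for _ in range(iterations):
        updated = []
        for m in range(num):
            # combinatorial element: sum the outputs of the other modules
            fb = [sum(extrinsic[o][t] for o in range(num) if o != m)
                  for t in range(n)]
            # permuter, then decoder with this module's received signal
            post = siso_stub([fb[j] for j in perms[m]], channel_llrs[m])
            # inverse permuter
            deint = [0.0] * n
            for pos, j in enumerate(perms[m]):
                deint[j] = post[pos]
            # differential element: subtract the delayed feedback input
            updated.append([d - f for d, f in zip(deint, fb)])
        extrinsic = updated                          # all modules in parallel
    # final step of claim 65: sum the output of each decoder module
    total = [sum(extrinsic[m][t] for m in range(num)) for t in range(n)]
    return [0 if v >= 0 else 1 for v in total], total


perms = [[0, 1, 2, 3], [2, 0, 3, 1], [3, 2, 1, 0]]
channel_llrs = [
    [2, -2, -2, 2],     # stream 0, natural order
    [1, 2, 2, -2],      # stream 1, first value has the wrong sign
    [2, -2, -2, 2],     # stream 2
]
decoded, total = decode(channel_llrs, perms)
print(decoded, total)
```

Subtracting the delayed feedback input keeps each module from passing its own prior back to the others, so only newly derived information circulates between modules.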

67. The method of claim 66, wherein the probability-based
decoder uses the maximum a posteriori probability
algorithm.

68. The decoder method of claim 66, wherein the 
probability-based decoder uses the soft output Viterbi algo- 
rithm. 

69. An iterative method for decoding at least one sequence
of received signals y_i, representative of systematic
convolutional encoded elements x_i generated by a
self-concatenated encoder from a source of original digital data
elements u_i, into decoded elements corresponding to the
original digital data elements u_i, the method comprising the
steps of:

(a) applying feedback signals to inputs of a plurality of 
combinatorial elements to generate first output signals; 

(b) applying the first output signals to a plurality of 
permuters to generate second output signals; 

(c) applying selected ones of the first and second output 
signals, and each sequence of received signals y_i, to a
probability-based decoder to generate third output signals
and a decoded output for decoded elements;

(d) applying the third output signals to a plurality of
inverse permuters to generate fourth output signals; 

(e) applying the fourth output signals to a plurality of 
differential combinatorial elements to generate feed- 
back signals; 

(f) applying the feedback signals to non-corresponding
inputs of the plurality of combinatorial elements; and 

(g) coupling a plurality of delay elements between a 
corresponding combinatorial element and a corre- 
sponding differential combinatorial element. 

70. The decoder method of claim 69, wherein the
probability-based decoder uses the maximum a posteriori 
probability algorithm. 

71. The decoder method of claim 69, wherein the 
probability-based decoder uses the soft output Viterbi algo- 
rithm. 

72. An iterative method for decoding at least one sequence
of received signals y_i, representative of code elements x_i
generated by a serial encoder from a source of original
digital data elements u_i, into decoded elements corresponding
to the original digital data elements u_i, the method
comprising the steps of: 

(a) applying feedback signals to respective inputs of a 
plurality of permuters to generate first output signals; 

(b) applying the first output signals, and the sequence of 
received signals y_i, to at least one first probability-based
decoder to generate second output signals;

(c) applying the second output signals to a plurality of 
inverse permuters to generate third output signals; 

(d) applying the third output signals to a plurality of first 
differential combinatorial elements to generate fourth 
output signals; 

(e) coupling a plurality of first delay elements between 
respective inputs of corresponding permuters and cor- 
responding first differential combinatorial elements; 

(f) applying the fourth output signals to a second 
probability-based decoder to generate fifth output sig- 
nals and a decoded output for decoded elements; 

(g) applying the fifth output signals to a plurality of 
second differential combinatorial elements, to generate 
feedback signals coupled to the second probability- 
based decoder and to the inputs of corresponding 
permuters; 

(h) coupling a plurality of second delay elements between 
corresponding inputs of the second probability-based 
decoder and corresponding second differential combi- 
natorial elements. 

73. The method of claim 72, wherein at least one 
probability-based decoder uses the maximum a posteriori
probability algorithm. 

74. The method of claim 72, wherein at least one 
probability-based decoder uses the soft output Viterbi
algorithm.